Volatility is one of the standard measures of risk in financial markets. From a statistical point of view, volatility is the annualized standard deviation of the returns of an underlying asset. Technically, there are two kinds of volatility: historical volatility, which is computed from historical data on an index, and implied volatility, which is estimated using statistical models and option prices. Implied volatility reflects the market's expectations about the volatility of the underlying asset from now until the expiry date. In this project, we model implied volatility using statistical models. More precisely, we estimate a specific kind of implied volatility, the VIX. The CBOE Volatility Index, or VIX, represents the market's expectations for volatility over the coming 30 days (or 60 days, for the second future). Investors use the VIX to measure the level of risk, fear, or stress in the market when making investment decisions. Once a solid estimate of the VIX has been made, we generate signals that tell us when to open or close a short position. Finally, we compute a value that represents our "confidence" when deciding to invest under specific volatility conditions.
In this section, we discuss the theoretical framework which is essential to carry out the project.
Options are contracts that give the bearer the right, but not the obligation, to either buy or sell an amount of some underlying asset at a predetermined price at or before the contract expires.
Depending on the situation, there is usually an option scenario appropriate for an investor's goal. A popular example is using options as an effective hedge against a declining stock market to limit downside losses. Options belong to the larger group of securities known as derivatives. A derivative's price is dependent on, or derived from, the price of something else.
Options are a type of derivative security. An option is a derivative because its price is intrinsically linked to the price of something else. If you buy an options contract, it grants you the right but not the obligation to buy or sell an underlying asset at a set price on or before a certain date.
A call option gives the holder the right to buy a stock and a put option gives the holder the right to sell a stock. Think of a call option as a down payment on a future purchase and a put option as an insurance policy.
There are four things you can do with options:
1. Buy (long) calls
2. Sell (short) calls
3. Buy (long) puts
4. Sell (short) puts
Buying stock gives you a long position. Buying a call option gives you a potential long position in the underlying stock. Short-selling a stock gives you a short position. Selling a naked or uncovered call gives you a potential short position in the underlying stock.
Buying a put option gives you a potential short position in the underlying stock. Selling a naked or unmarried put gives you a potential long position in the underlying stock. Keeping these four scenarios straight is crucial.
People who buy options are called holders and those who sell options are called writers of options.
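The four positions above are easiest to keep straight through their payoffs at expiry. A minimal sketch (strike and terminal prices are illustrative):

```python
import numpy as np

def call_payoff(S, K):
    """Payoff of a long call at expiry: max(S - K, 0)."""
    return np.maximum(S - K, 0.0)

def put_payoff(S, K):
    """Payoff of a long put at expiry: max(K - S, 0)."""
    return np.maximum(K - S, 0.0)

K = 100.0                            # illustrative strike
S = np.array([80.0, 100.0, 120.0])   # illustrative terminal prices of the underlying

print(call_payoff(S, K))    # long call gains when S > K
print(put_payoff(S, K))     # long put gains when S < K
print(-call_payoff(S, K))   # a written (short) call is the mirror image of the long call
```

The writer's payoff is simply the negative of the holder's, which is why a naked short call behaves like a potential short position in the underlying.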
from IPython.display import Image
Image('figures/put option.png', width=800)
Brownian motion describes movements of a particle which are caused by small shocks from other particles. The particle shows two types of movements:
1. Drift: as depicted, this movement drives the particle away from its initial position.
2. Fluctuation: the uncertainty, or noise, of the movement, which can sometimes change the whole direction of the drift.
$${dx}= {adt}+{bdz}$$
Image('figures/p.png', width= 400)
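The equation $dx = a\,dt + b\,dz$ can be simulated directly by accumulating small increments; a sketch with illustrative drift and fluctuation coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

a, b = 0.5, 0.2       # illustrative drift and fluctuation coefficients
T, steps = 1.0, 1000
dt = T / steps

# dz is a Wiener increment: normally distributed with mean 0 and variance dt
dz = rng.normal(0.0, np.sqrt(dt), size=steps)
x = np.cumsum(a * dt + b * dz)   # x_t built up from the increments dx

# Over [0, T] the drift contributes a*T on average; the noise averages out
print(x[-1])
```

The terminal value is distributed as $N(aT,\, b^2 T)$, which is the defining property of a generalized Wiener process.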
Volatility often refers to the amount of uncertainty or risk related to the size of changes in a security's value. A higher volatility means that a security's value can potentially be spread out over a larger range of values. This means that the price of the security can change dramatically over a short time period in either direction. It is calculated as the standard deviation of returns around their expected value.
If we scrutinize stock price trajectories, we see that stock prices follow a similarly uncertain pattern.
A widely used model for the stock price behaviour is the Geometric Brownian Motion (GBM) model of stock price:
or in its more formal shape: $$ \frac{dS}{S} = \mu dt+\sigma dz$$
S: Stock Price
$\mu$: Stock's expected rate of return
$\sigma$ : Volatility of the Stock Price
From the point of view of physics, this is called a generalized Wiener process: the mean change per unit time of the process is known as the drift rate, and the variance per unit time is known as the variance rate.
Notice: the stock price itself does not follow a generalized Wiener process. We need to normalize the change in the stock price at each time with respect to the stock price at that time. In other words, the variable that satisfies the generalized Wiener process equation is the change of the stock price at time t divided by the stock price at the same time, which is called the return (the quantity $dS/S$ in the equation above).
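A quick simulation makes this concrete: under GBM it is the return $dS/S$, not $dS$ itself, that has (approximately) constant drift and variance per unit time. Parameters below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 0.10, 0.20    # illustrative annualized expected return and volatility
dt = 1 / 252              # one trading day
S0, steps = 100.0, 10000

# GBM simulated via the exact log-normal step
Z = rng.normal(size=steps)
S = S0 * np.cumprod(np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * Z))

prev = np.concatenate(([S0], S[:-1]))
returns = S / prev - 1.0
# Daily returns should have mean ~ mu*dt and std ~ sigma*sqrt(dt),
# regardless of the level of S
print(returns.mean(), returns.std())
```

The sample mean is close to $\mu\,dt$ and the sample standard deviation close to $\sigma\sqrt{dt}$, even though the price level itself drifts far from $S_0$.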
By Itô's lemma, for each variable x that follows a Brownian motion, a function C of x and time follows a Brownian motion as well:
$${dx}= {adt}+{bdz}$$
$$ {dC}=\displaystyle \bigg[\frac{\partial C}{\partial x}{a}+\frac{\partial C}{\partial t}+ \frac{1}{2}\frac{\partial^{2} C}{\partial x^{2}}{b^{2}}\bigg]{dt} + \frac{\partial C}{\partial x}{b}{dz} $$
Going back to option pricing, C can be taken to be the call option price, which is a function of the stock price (the underlying asset) and time. It follows that call option prices follow a Brownian motion too.
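As a worked example of the lemma, take $C = \ln x$ with $a = \mu x$ and $b = \sigma x$ (the GBM case). Then $\frac{\partial C}{\partial x} = \frac{1}{x}$, $\frac{\partial^2 C}{\partial x^2} = -\frac{1}{x^2}$, $\frac{\partial C}{\partial t} = 0$, and substituting into the formula gives
$$ d(\ln x) = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dz $$
which is exactly the log-price dynamics used later to simulate stock prices.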
In 1997, Scholes and Merton were awarded the Nobel Prize (Black had died in 1995) for finding a function that not only describes the call option price as a function of the current stock price, but is also an exact solution of the equation given by Itô's lemma.
In the derivation of the Black-Scholes formula, some basic assumptions about the market are taken into account:
1. No dividends are paid out during the life of the option.
2. Markets are random (market movements cannot be predicted).
3. There are no transaction costs in buying the option.
4. The volatility of the underlying asset is constant.
5. The option is European.
The Black-Scholes price of a European call is
$$ C = {S}_{t} N({d}_{1}) - K e^{-rt} N({d}_{2}) $$
where
$${d}_{1}=\frac{\ln\frac{{S}_{t}}{K} + (r + \frac{{\sigma}^2}{2})t}{{\sigma}{\sqrt {t}}} $$
and
$$ {d}_{2}= {d}_{1} - {\sigma}{\sqrt {t}} $$
with ${S}_{t}$ the stock price, $K$ the strike price, $r$ the risk-free rate, $t$ the time to maturity, $\sigma$ the volatility of the underlying, and $N(\cdot)$ the cumulative standard normal distribution.
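The standard Black-Scholes call price $C = S_t N(d_1) - K e^{-rt} N(d_2)$ translates directly into code; a minimal sketch using scipy, with illustrative inputs:

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, K, t, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    return S * norm.cdf(d1) - K * np.exp(-r * t) * norm.cdf(d2)

# Illustrative inputs: at-the-money call, one year to expiry
price = bs_call(S=100.0, K=100.0, t=1.0, r=0.05, sigma=0.20)
print(round(price, 4))
```

All five assumed inputs (spot, strike, maturity, rate, volatility) are placeholders chosen for illustration.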
One of the well-accepted ways to gauge the wellbeing of the market is to plot volatility. Practitioners usually use the Black-Scholes model to derive the volatility rather than the option price itself: the option price is taken from the market or derived from other models, and then plugged into the Black-Scholes formula to obtain the implied volatility.
The volatility obtained from the model above is constant for every maturity, which does not square with reality.
That is why we introduce a finer model with more parameters, one that allows a volatility that changes over time and gives investors a better clue about changes in option prices.
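Inverting the pricing formula numerically is how implied volatility is obtained in practice: given a market price, find the $\sigma$ at which the Black-Scholes price matches it. A sketch using a root finder (the `bs_call` helper is the standard formula, restated so the snippet is self-contained; all numeric inputs are illustrative):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def bs_call(S, K, t, r, sigma):
    """Standard Black-Scholes European call price."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * t) / (sigma * np.sqrt(t))
    return S * norm.cdf(d1) - K * np.exp(-r * t) * norm.cdf(d1 - sigma * np.sqrt(t))

def implied_vol(price, S, K, t, r):
    """Solve bs_call(..., sigma) = price for sigma on a wide bracket."""
    return brentq(lambda s: bs_call(S, K, t, r, s) - price, 1e-6, 5.0)

# Round-trip check with an illustrative volatility of 25%
true_sigma = 0.25
price = bs_call(100.0, 100.0, 0.5, 0.02, true_sigma)
print(implied_vol(price, 100.0, 100.0, 0.5, 0.02))
```

Since the call price is monotonically increasing in $\sigma$, the bracket contains exactly one root and the recovered value matches the volatility used to generate the price.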
from IPython.display import Image
Image('figures/m.png')
The Heston Model, developed by associate finance professor Steven Heston in 1993, is an option pricing model that can be used for pricing options on various securities. Unlike the Black-Scholes model, this model uses a type of volatility (stochastic volatility) that changes over time.
Stochastic means the volatility is random: in the Heston model we draw these random values from a distribution. The random values do not depend on past history; the only thing they depend on is the current value. Therefore, we can conclude that the volatility follows a Brownian motion as well.
$$ {dS}_{t} = r{S}_{t}\,dt + \sqrt{{V}_{t}}\,{S}_{t}\,{dW}_{1t} $$
and
$$ {dV}_{t} = k({\theta}- {V}_{t})\,dt + {\sigma}\sqrt{{V}_{t}}\,{dW}_{2t} $$
The volatility above follows a stochastic process: the increment ${dV}_{t}$ has expectation ${k}({\theta} - {V}_{t})\,dt$ (the drift term) and standard deviation ${\sigma}\sqrt{{V}_{t}\,dt}$ (from the diffusion term ${\sigma}\sqrt{{V}_{t}}\,{dW}_{2t}$).
The main goals of the project are listed below:
In the following we report an analysis of the volatility index (VIX) of the Standard & Poor's 500, the most important North American stock index and the main equity benchmark for stocks listed on Wall Street. In particular, we carried out a parallel analysis for the first (UX1) and second (UX2) VIX futures.
Our analysis is based on the Heston model, which we use to compute the predicted volatility for the following day; we then implement two different investing strategies and compute the profit of each.
To implement the Heston model we used two different techniques to discretize the stochastic differential equation: the Euler method and the Milstein method.
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import seaborn as sns
import pandas as pd
import numpy.random as npr
from mpl_toolkits.axes_grid1.inset_locator import inset_axes,zoomed_inset_axes, mark_inset
from matplotlib.ticker import AutoMinorLocator, MultipleLocator, FuncFormatter
import matplotlib.dates as mdates
from datetime import datetime, date, timedelta
from scipy import optimize
from scipy import stats
from scipy import integrate
import datetime
import copy
import warnings
warnings.filterwarnings('ignore')
import csv
import sklearn as sl
import numdifftools as nd
from sklearn import datasets
from sklearn import linear_model
from scipy.optimize import minimize
from mpl_toolkits.mplot3d import axes3d
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import GridSearchCV
npr.seed(12345)
%matplotlib inline
We used the pandas function read_excel to create two dataframes from the datasets containing the first- and second-future values of the VIX.
Then, since we have one value per date, we set the date itself as the index of each dataframe.
dir = 'data/'
file_name = 'grid1_zheb51fo.xlsx'
UX1 = pd.read_excel(dir+file_name, sheet_name='UX1_Index')
UX2 = pd.read_excel(dir+file_name, sheet_name='UX2_Index')
UX1 = UX1.set_index('Date')
UX2 = UX2.set_index('Date')
UX1.sort_index(inplace=True)
UX2.sort_index(inplace=True)
UX1
| Date | PX_LAST | PX_VOLUME |
|---|---|---|
| 2010-01-04 | 24.8500 | 3138 |
| 2010-01-05 | 24.4500 | 1936 |
| 2010-01-06 | 23.5500 | 1958 |
| 2010-01-07 | 23.1500 | 1760 |
| 2010-01-08 | 22.5500 | 2050 |
| ... | ... | ... |
| 2022-01-12 | 20.9713 | 76675 |
| 2022-01-13 | 22.0944 | 82700 |
| 2022-01-14 | 21.7462 | 102849 |
| 2022-01-17 | NaN | 0 |
| 2022-01-18 | 21.7800 | 5655 |
3034 rows × 2 columns
We downloaded historical data of SPX price found at https://www.wsj.com/market-data/quotes/index/SPX/historical-prices.
file_name = 'SPX_HistoricalData.csv'
SPX_price = pd.read_csv(dir+file_name)
SPX_price['Date'] = pd.to_datetime(SPX_price['Date'],format='%m/%d/%y')
SPX_price = SPX_price.set_index('Date')
SPX_price.sort_index(inplace=True)
SPX_price
| Date | Open | High | Low | Close |
|---|---|---|---|---|
| 2010-04-01 | 1171.23 | 1181.43 | 1170.69 | 1178.10 |
| 2010-04-05 | 1178.71 | 1187.73 | 1178.71 | 1187.44 |
| 2010-04-06 | 1186.01 | 1191.80 | 1182.77 | 1189.44 |
| 2010-04-07 | 1188.23 | 1189.60 | 1177.25 | 1182.45 |
| 2010-04-08 | 1181.75 | 1188.55 | 1175.12 | 1186.44 |
| ... | ... | ... | ... | ... |
| 2022-01-10 | 4655.34 | 4673.02 | 4582.24 | 4670.29 |
| 2022-01-11 | 4669.14 | 4714.13 | 4638.27 | 4713.07 |
| 2022-01-12 | 4728.59 | 4748.83 | 4706.71 | 4726.35 |
| 2022-01-13 | 4733.56 | 4744.13 | 4650.29 | 4659.03 |
| 2022-01-14 | 4637.99 | 4665.13 | 4614.75 | 4662.85 |
2970 rows × 4 columns
fig,ax = plt.subplots(figsize=(15,10))
ax.plot(UX1.index, UX1.PX_LAST, label='UX1', color='blue')
ax.plot(UX2.index, UX2.PX_LAST, label='UX2', color='red')
ax.legend(loc='upper center', ncol=2)
ax.grid(linestyle='dotted')
ax.annotate('European Debt Crisis', xy = ( date(2011,2,1), 43 ), xytext = ( date(2011,2,1), 50 ),
ha='center', va='center',
arrowprops=dict(arrowstyle='-[, widthB=6, lengthB=2' ) )
ax.annotate( "COVID crisis", xy = ( date(2020,3,10), 68 ), xytext = ( date(2020,10,1), 60 ),
arrowprops = dict( arrowstyle = "->" ) )
# An inner plot to zoom
axes1 = plt.axes( [0.35, 0.45, 0.35, 0.35] )
axes1.set_title( 'Zoom' )
axes1.plot( UX1.index, UX1.PX_LAST, color='blue' )
axes1.plot( UX2.index, UX2.PX_LAST, color='red' )
axes1.grid( linestyle = 'dotted' )
axes1.set_xlim([date(2018, 1, 1), date(2019, 1, 1)])
axes1.set_ylim(bottom = 10, top = 30)
axes1.tick_params(axis='x', labelrotation=45)
plt.show()
fig,ax = plt.subplots(nrows=2,figsize=(15,10), sharex=True)
ax[0].plot(SPX_price.index, SPX_price.Close, label='Price', color='blue')
ax[0].legend(loc='upper center', ncol=2)
ax[0].grid(linestyle='dotted')
ax[0].set_title('Price of S&P500 stocks')
ax[0].set_ylabel('Price [$]')
ax[1].plot(UX1.index, UX1.PX_LAST, label='UX1', color='blue')
ax[1].plot(UX2.index, UX2.PX_LAST, label='UX2', color='red')
ax[1].legend(loc='upper center', ncol=2)
ax[1].grid(linestyle='dotted')
ax[1].set_title('VIX index')
ax[1].set_ylabel('Volatility')
plt.show()
fig, ax1 = plt.subplots(figsize = ( 15, 10 ) )
color = 'tab:red'
ax1.set_xlabel('')
ax1.set_ylabel('Price SPX', color=color)
ax1.plot( SPX_price.index, SPX_price.Close, label='Price', color=color, lw=2)
ax1.tick_params(axis='y', labelcolor=color)
ax1.grid(linestyle='dotted',color=color)
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%b %y"))
ax1.xaxis.set_minor_formatter(mdates.DateFormatter("%b %y"))
ax1.set_title('VIX vs. Asset Price - COVID-19 Crisis', fontsize=14)
ax1.set_xlim([date(2020, 1, 1), date(2020, 6, 1)])
ax1.set_ylim([2000, 3500])
ax1.legend(loc='upper center', bbox_to_anchor=(0.55, 1))
ax2 = ax1.twinx()
color = 'tab:blue'
ax2.set_ylabel('VIX value', color=color)
ax2.plot(UX1.index, UX1.PX_LAST, label='UX1', color=color, ls='solid', lw=3)
ax2.plot(UX2.index, UX2.PX_LAST, label='UX2', color=color, ls='dashed', lw=3)
ax2.tick_params(axis='y', labelcolor=color)
ax2.legend(loc='upper center', bbox_to_anchor=(0.45, 1))
ax2.grid(linestyle='dotted', color=color)
#ax2.set_ylim([14, 45])
fig.autofmt_xdate()
fig.tight_layout() # otherwise the right y-label is slightly clipped
plt.show()
We immediately see how a spike in the VIX values generally corresponds to a fall in SPX prices.
The Heston model is described by the bivariate stochastic process for the stock price $S_t$ and its variance $v_t$
$$ {dS}_{t} = r{S}_{t}\,dt + \sqrt{{V}_{t}}\,{S}_{t}\,{dW}_{1t}$$
$$ {dV}_{t} = k({\theta}- {V}_{t})\,dt + {\sigma}\sqrt{{V}_{t}}\,{dW}_{2t}$$
where $E[dW_{1t}\,dW_{2t}] = \rho\,dt$
We simulate $S_t$ over the time interval $[0,T]$, which we assume is discretized as $0 = t_1 < t_2 < ... < t_m = T$, where the time increments are equally spaced with width $dt$. Integrating $dS_t$ from $t$ to $t + dt$ produces
$$S_{t+dt} = S_t + \int^{t+dt}_t \mu(S_u, u) du + \int^{t+dt}_t \sigma(S_u, u) dW_u $$
The previous equation is the starting point for any discretization scheme. At time $t$, the value of $S_t$ is known, and we wish to obtain the next value $S_{t+dt}$.
This is equivalent to approximating the integrals using the left-point rule. Hence the first integral is approximated as the product of the integrand at time $t$ and the integration range $dt$:
$$\int^{t+dt}_t \mu(S_u, u) du \simeq \mu(S_t, t) \int^{t+dt}_t du = \mu(S_t,t)dt $$
In an identical fashion, the second integral is approximated as
$$ \int^{t+dt}_t \sigma(S_u, u) dW_u \simeq \sigma(S_t, t) \int^{t+dt}_t dW_u = \sigma(S_t, t) (W_{t+dt} - W_t) = \sigma(S_t, t) \sqrt{dt}Z $$
with $Z$ a standard normal random variable.
The SDE for $v_t$ in the Heston model in integral form is
$$ v_{t+dt} = v_{t} + \int_t^{t+dt}\kappa(\theta - v_u)du + \int_t^{t+dt}\sigma\sqrt{v_u}dW_{2,u} $$which becomes
$$v_{t+dt} = v_t + \kappa(\theta-v_t)dt+\sigma \sqrt{v_t dt}Z_v$$
On the other hand, using Itô's lemma we obtain
$$ S_{t+dt} = S_t \cdot \exp{\left(\left(r - \frac{1}{2}v_t\right)dt + \sqrt{v_t dt}Z_s\right)}$$
where it is necessary to replace $v_t$ with $|v_t|$ to avoid negative variances.
The key to the Milstein scheme is that the accuracy of the discretization is increased by expanding the coefficients $\mu_t = \mu(S_t)$ and $\sigma_t = \sigma(S_t)$ via Itô's lemma.
We obtain
$$v_{t+dt} = v_t + \kappa(\theta-v_t)dt+\sigma \sqrt{v_t dt}Z_v + \frac{1}{4}\sigma^2dt(Z_v^2 - 1)$$
$$ S_{t+dt} = S_t \cdot \exp{\left(\left(r - \frac{1}{2}v_t\right)dt + \sqrt{v_t dt}Z_s\right)}$$
We start by selecting a common period of time for the three sets of data.
start = '2010-04-05'
stop = '2022-01-10'
mask = ( UX1.index >= start ) & ( UX1.index < stop )
UX1 = UX1[ mask]
mask = ( UX2.index >= start ) & ( UX2.index < stop )
UX2 = UX2[ mask]
mask = ( SPX_price.index >= start ) & ( SPX_price.index < stop )
SPX_price = SPX_price[mask]
UX1 = UX1.drop(UX1.drop(SPX_price.index).index)
UX2 = UX2.drop(UX2.drop(SPX_price.index).index)
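The double drop above keeps only the dates present in both series. An equivalent, perhaps more readable, formulation uses the index intersection; a sketch on toy frames (the column names `x` and `y` are illustrative):

```python
import pandas as pd

a = pd.DataFrame({'x': [1, 2, 3]},
                 index=pd.to_datetime(['2022-01-03', '2022-01-04', '2022-01-05']))
b = pd.DataFrame({'y': [10, 30]},
                 index=pd.to_datetime(['2022-01-03', '2022-01-05']))

# Keep only the dates both frames share
common = a.index.intersection(b.index)
a_aligned = a.loc[common]
b_aligned = b.loc[common]
print(a_aligned.index.equals(b_aligned.index))
```

After alignment the two frames have identical indices, which is what the subsequent calibration loop relies on when it pairs VIX and SPX values by position.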
def heston_pde_euler( S_0, r, k, theta,
v_0, rho, sigma, steps, Npaths ):
'''Euler scheme for Heston model PDE'''
dt = 1 / 252 / steps
size = ( Npaths, steps )
prices = np.zeros( size )
volatility = np.zeros( size )
S_t = np.full ( Npaths, S_0 )
v_t = np.full ( Npaths, np.abs( v_0 ) )
for t in range( steps ):
WT = np.sqrt( dt ) * np.random.multivariate_normal( np.array( [0, 0] ), np.array( [[1, rho], [rho, 1]] ), size=Npaths )
#WT = np.sqrt( dt ) * np.random.normal( 0, sigma , size = Npaths )
S_t = S_t * np.exp( ( r - 0.5 * v_t / 100 ) * dt + np.sqrt( v_t / 100 ) * WT[:, 0] )
v_t = np.abs( v_t + k * ( theta - v_t ) * dt + sigma * np.sqrt( v_t ) * WT[:, 1] )
volatility[:,t] = v_t
prices[ :,t] = S_t
return prices, volatility
def heston_pde_milstein( S_0, r, k, theta,
v_0, rho, sigma, steps, Npaths ):
'''Milstein scheme for Heston model PDE'''
dt = 1 / 252 / steps
size = ( Npaths, steps )
prices = np.zeros( size )
volatility = np.zeros( size )
S_t = np.full ( Npaths, S_0 )
v_t = np.full ( Npaths, np.abs( v_0 ) )
for t in range( steps ):
WT = np.sqrt( dt ) * np.random.multivariate_normal( np.array( [0, 0] ), np.array( [[1, rho], [rho, 1]] ), size=Npaths )
#WT = np.sqrt( dt ) * np.random.normal( 0, sigma , size = Npaths )
S_t = S_t * np.exp( ( r - 0.5 * v_t / 100 ) * dt + np.sqrt( v_t / 100 ) * WT[:, 0] )
v_t = np.abs( v_t + k * ( theta - v_t ) * dt + sigma * np.sqrt( v_t ) * WT[:, 1] - .25 * sigma**2 * ( WT[:,1]**2 - dt ) )
volatility[:,t] = v_t
prices[ :,t] = S_t
return prices, volatility
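As a sanity check of the Euler discretization, a compact restatement of one step of the scheme (with the variance kept positive via the absolute value, as in the functions above). The parameters are illustrative, with the variance here in decimal terms rather than the VIX-points convention used in the notebook's functions:

```python
import numpy as np

def heston_euler_step(S0, r, k, theta, v0, rho, sigma, dt, Npaths, rng):
    """One Euler step of the Heston SDEs with correlated Brownian increments."""
    Z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=Npaths)
    dW = np.sqrt(dt) * Z
    S1 = S0 * np.exp((r - 0.5 * v0) * dt + np.sqrt(v0) * dW[:, 0])
    v1 = np.abs(v0 + k * (theta - v0) * dt + sigma * np.sqrt(v0) * dW[:, 1])
    return S1, v1

rng = np.random.default_rng(0)
dt = 1 / 252
# Illustrative parameters: S0=4000, r=2%, kappa=3, theta=0.04, v0=0.05, rho=-0.79
S1, v1 = heston_euler_step(4000.0, 0.02, 3.0, 0.04, 0.05, -0.79, 0.5, dt, 10000, rng)
print(S1.mean(), v1.mean())
```

Over a single day the mean simulated variance barely moves from $v_0$ (it is pulled slightly toward $\theta$ at rate $\kappa$), and the mean price stays near $S_0 e^{r\,dt}$, which is the behaviour one should expect before trusting the full calibration loop.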
For each value of the VIX at time $t$ we want to compute the value at $t+dt$. We avoid predicting further ahead: the Heston model is only a model, and a simple model cannot be expected to approximate the real behaviour of the VIX far from the starting point.
Here we evaluate the best number of past values to use for the parameter calibration, searching over $n\in[3,30]$. The parameters we estimate are the following:
$\theta$: the long variance, or long-run average variance of the price; as $t$ tends to infinity, the expected value of $v_t$ tends to $\theta$;
$\kappa$: the rate at which $v_t$ reverts to $\theta$;
$\sigma$: the volatility of the volatility, which determines the variance of $v_t$.
We find that, using a Maximum Likelihood Estimation (MLE):
$$ \kappa = -\frac{\log\beta_1}{\delta} $$$$ \theta = \beta_2 $$$$ \sigma^2 = \frac{2\kappa\beta_3}{1-\beta_1^2} $$with
$$\beta_1 = \left|\frac{n^{-2}\sum_{k=1}^nV_k\sum_{k=1}^nV_{k-1}^{-1}-n^{-1}\sum_{k=1}^nV_kV_{k-1}^{-1}}{n^{-2}\sum_{k=1}^nV_{k-1}\sum_{k=1}^nV_{k-1}^{-1}-1}\right|$$
$$\beta_2 = \left|\frac{n^{-1}\sum_{k=1}^nV_kV_{k-1}^{-1}-\beta_1}{(1-\beta_1)n^{-1}\sum_{k=1}^nV_{k-1}^{-1}}\right|$$
$$\beta_3 = \left|n^{-1}\sum_{k=1}^n\left[V_k - V_{k-1}\beta_1 - \beta_2(1-\beta_1)^2\right]V_{k-1}^{-1}\right|$$
We want to calibrate the number of previous days $n$ to use so as to get the best results in terms of a loss function, given by:
$$ L = \left(VIX_{pred}-VIX_{real}\right)^2 $$
T1 = 1 / 12
T2 = 1 / 6
steps = 1
Npaths = 100
delta = 1 / 252
min = 3
max = 30
window = np.arange( min, max )
rho = -.7890
r = 0.0217
volat1_e = np.zeros( shape = ( max - min, UX1.index.size, Npaths ) )
volat2_e = np.zeros( shape = ( max - min, UX2.index.size, Npaths ) )
price1_e = np.zeros( shape = ( max - min, SPX_price.index.size, Npaths ) )
price2_e = np.zeros( shape = ( max - min, SPX_price.index.size, Npaths ) )
vol1_mean_e = np.zeros( shape = ( max - min, UX1.index.size ) )
vol2_mean_e = np.zeros( shape = ( max - min, UX2.index.size ) )
price1_mean_e = np.zeros( shape = ( max - min, SPX_price.index.size ) )
price2_mean_e = np.zeros( shape = ( max - min, SPX_price.index.size ) )
volat1_m = np.zeros( shape = ( max - min, UX1.index.size, Npaths ) )
volat2_m = np.zeros( shape = ( max - min, UX2.index.size, Npaths ) )
price1_m = np.zeros( shape = ( max - min, SPX_price.index.size, Npaths ) )
price2_m = np.zeros( shape = ( max - min, SPX_price.index.size, Npaths ) )
vol1_mean_m = np.zeros( shape = ( max - min, UX1.index.size ) )
vol2_mean_m = np.zeros( shape = ( max - min, UX2.index.size ) )
price1_mean_m = np.zeros( shape = ( max - min, SPX_price.index.size ) )
price2_mean_m = np.zeros( shape = ( max - min, SPX_price.index.size ) )
for n in window:
print( '... processing window', n, '...' )
prices1_e = np.empty( ( SPX_price.index.size - n, Npaths ) )
prices2_e = np.empty( ( SPX_price.index.size - n, Npaths ) )
vol1_e = np.empty( ( UX1.index.size - n, Npaths ) )
vol2_e = np.empty( ( UX2.index.size - n, Npaths ) )
prices1_m = np.empty( ( SPX_price.index.size - n, Npaths ) )
prices2_m = np.empty( ( SPX_price.index.size - n, Npaths ) )
vol1_m = np.empty( ( UX1.index.size - n, Npaths ) )
vol2_m = np.empty( ( UX2.index.size - n, Npaths ) )
for i in range(n,UX1.index.size):
Vt1 = UX1.PX_LAST.values[ i-n:i]**2
Vt2 = UX2.PX_LAST.values[ i-n:i]**2
S = SPX_price.Close.values[ i-n:i]
V10 = Vt1[-1]
V20 = Vt2[-1]
S0 = S[-1]
b1 = np.abs( ( ( 1 / n**2 * sum( Vt1[i ] for i in range( 1, n ) ) * sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) -
( 1 / n * sum( Vt1[i ] * Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) ) /
( 1 / n**2 * sum( Vt1[i-1] for i in range( 1, n ) ) * sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - 1 ) )
b2 = np.abs( ( 1 / n * sum( Vt1[i ] * Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - b1 ) /
( sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) * ( 1 - b1 ) / n ) )
b3 = np.abs( 1 / n * sum( ( Vt1[i] - Vt1[i-1] * b1 - b2 * ( 1 - b1 )**2 ) / Vt1[ i - 1 ] for i in range( 1, n ) ) )
k1 = - np.log( b1 ) / delta
theta1 = b2
sigma1 = np.sqrt( 2 * k1 * b3 / ( 1 - b1**2 ) )
b1 = np.abs( ( ( 1 / n**2 * sum( Vt2[i ] for i in range( 1, n ) ) * sum( Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) -
( 1 / n * sum( Vt2[i ] * Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) ) /
( 1 / n**2 * sum( Vt2[i-1] for i in range( 1, n ) ) * sum( Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - 1 ) )
b2 = np.abs( ( 1 / n * sum( Vt2[i ] * Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - b1 ) /
( sum( Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) * ( 1 - b1 ) / n ) )
b3 = np.abs( 1 / n * sum( ( Vt2[i] - Vt2[i-1] * b1 - b2 * ( 1 - b1 )**2 ) / Vt2[ i - 1 ] for i in range( 1, n ) ) )
k2 = - np.log( b1 ) / delta
theta2 = b2
sigma2 = np.sqrt( 2 * k2 * b3 / ( 1 - b1**2 ) )
p1e, v1e = heston_pde_euler( S0, r, k1, theta1, V10, rho, sigma1, steps, Npaths )
p2e, v2e = heston_pde_euler( S0, r, k2, theta2, V20, rho, sigma2, steps, Npaths )
p1m, v1m = heston_pde_milstein( S0, r, k1, theta1, V10, rho, sigma1, steps, Npaths )
p2m, v2m = heston_pde_milstein( S0, r, k2, theta2, V20, rho, sigma2, steps, Npaths )
prices1_e[i-n,:] = p1e[:].ravel( )
prices2_e[i-n,:] = p2e[:].ravel( )
vol1_e[ i-n,:] = v1e[:].ravel( )
vol2_e[ i-n,:] = v2e[:].ravel( )
prices1_m[i-n,:] = p1m[:].ravel( )
prices2_m[i-n,:] = p2m[:].ravel( )
vol1_m[ i-n,:] = v1m[:].ravel( )
vol2_m[ i-n,:] = v2m[:].ravel( )
vol1_e = np.insert( vol1_e, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
vol2_e = np.insert( vol2_e, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
prices1_e = np.insert( prices1_e, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
prices2_e = np.insert( prices2_e, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
vol1_m = np.insert( vol1_m, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
vol2_m = np.insert( vol2_m, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
prices1_m = np.insert( prices1_m, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
prices2_m = np.insert( prices2_m, 0, np.full( ( n, Npaths ), np.nan ), axis = 0 )
volat1_e[ n - min, :, : ] = np.sqrt( vol1_e )
volat2_e[ n - min, :, : ] = np.sqrt( vol2_e )
price1_e[ n - min, :, : ] = prices1_e
price2_e[ n - min, :, : ] = prices2_e
volat1_m[ n - min, :, : ] = np.sqrt( vol1_m )
volat2_m[ n - min, :, : ] = np.sqrt( vol2_m )
price1_m[ n - min, :, : ] = prices1_m
price2_m[ n - min, :, : ] = prices2_m
vol1_mean_e[ n - min, : ] = np.sqrt( vol1_e.mean( axis = 1 ) )
vol2_mean_e[ n - min, : ] = np.sqrt( vol2_e.mean( axis = 1 ) )
price1_mean_e[ n - min, : ] = prices1_e.mean( axis = 1 )
price2_mean_e[ n - min, : ] = prices2_e.mean( axis = 1 )
vol1_mean_m[ n - min, : ] = np.sqrt( vol1_m.mean( axis = 1 ) )
vol2_mean_m[ n - min, : ] = np.sqrt( vol2_m.mean( axis = 1 ) )
price1_mean_m[ n - min, : ] = prices1_m.mean( axis = 1 )
price2_mean_m[ n - min, : ] = prices2_m.mean( axis = 1 )
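The $\beta$ estimators used inline in the loop above can also be written compactly with numpy; a vectorized sketch of the same formulas (the synthetic input series is illustrative):

```python
import numpy as np

def heston_mle_params(V, delta=1/252):
    """Closed-form MLE estimates of (kappa, theta, sigma) from a variance
    series V_0..V_{n-1}, vectorizing the beta formulas used in the loop."""
    n = len(V)
    Vk, Vp = np.asarray(V[1:], float), np.asarray(V[:-1], float)  # V_k and V_{k-1}
    s_inv = np.sum(1.0 / Vp)
    b1 = np.abs((np.sum(Vk) * s_inv / n**2 - np.sum(Vk / Vp) / n)
                / (np.sum(Vp) * s_inv / n**2 - 1.0))
    b2 = np.abs((np.sum(Vk / Vp) / n - b1) / (s_inv * (1.0 - b1) / n))
    b3 = np.abs(np.sum((Vk - Vp * b1 - b2 * (1.0 - b1)**2) / Vp) / n)
    kappa = -np.log(b1) / delta
    theta = b2
    sigma = np.sqrt(2.0 * kappa * b3 / (1.0 - b1**2))
    return kappa, theta, sigma

# Illustrative synthetic variance series (VIX points squared, as in the loop)
rng = np.random.default_rng(2)
V = np.abs(400.0 + 20.0 * rng.standard_normal(8))
kappa, theta, sigma = heston_mle_params(V)
print(kappa, theta, sigma)
```

The vectorized sums replace the generator expressions used in the calibration loop, which makes the index convention (sums over $k = 1,\dots,n-1$, normalized by $n$) easier to read and to check.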
We plot the loss as a function of the number $n$ of days used to calibrate the model: we see immediately that there is no significant difference between the Milstein and Euler schemes.
fig, ax = plt.subplots( nrows = 2, figsize=( 15, 12 ) )
ax[0].plot( np.arange( min, max ), sum( ( vol1_mean_e[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) ), label = 'Euler scheme' )
ax[0].plot( np.arange( min, max ), sum( ( vol1_mean_m[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) ), label = 'Milstein scheme' )
ax[0].set_xlabel('# Window used')
ax[0].set_ylabel('Total residual')
ax[0].set_title( 'Loss as function of #days used for calibration' )
ax[0].grid(linestyle='dotted')
ax[0].legend(loc = 'upper right')
# An inner plot to zoom
axes1 = plt.axes( [0.175, 0.70, 0.2, 0.2] )
axes1.set_title( 'Zoom' )
axes1.plot( np.arange( 0 + min, 16 + min ), sum( ( vol1_mean_e[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )[0:16] )
axes1.plot( np.arange( 0 + min, 16 + min ), sum( ( vol1_mean_m[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )[0:16] )
axes1.grid( linestyle = 'dotted' )
axes1.set_ylim( [9800,12000] )
axes1.set_xlim( [5,8] )
axes1.xaxis.set_major_locator(MultipleLocator(1))
ax[1].plot( np.arange( min, max ), sum( ( vol2_mean_e[ :, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) ), label = 'Euler scheme' )
ax[1].plot( np.arange( min, max ), sum( ( vol2_mean_m[ :, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) ), label = 'Milstein scheme' )
ax[1].set_xlabel('# Window used')
ax[1].set_ylabel('Total residual')
ax[1].set_title( 'Loss as function of #days used for calibration' )
ax[1].grid(linestyle='dotted')
ax[1].legend(loc = 'center right')
# An inner plot to zoom
axes2 = plt.axes( [ 0.175, 0.30, 0.2, 0.2 ] )
axes2.set_title( 'Zoom' )
axes2.plot( np.arange( 5 + min, 16 + min ), sum( ( vol2_mean_e[ :, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) )[5:16] )
axes2.plot( np.arange( 5 + min, 16 + min ), sum( ( vol2_mean_m[ :, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) )[5:16] )
axes2.grid( linestyle = 'dotted' )
axes2.set_ylim( [7300,9250] )
axes2.set_xlim( [10,13] )
axes2.xaxis.set_major_locator(MultipleLocator(1))
plt.show()
index_min_1e = np.argmin( sum( ( vol1_mean_e[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) ) )
index_min_2e = np.argmin( sum( ( vol2_mean_e[ :, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) ) )
index_min_1m = np.argmin( sum( ( vol1_mean_m[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) ) )
index_min_2m = np.argmin( sum( ( vol2_mean_m[ :, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) ) )
print( 'Window used that gives minimum distance with Euler scheme for UX1:\t\t', index_min_1e + min)
print( 'Window used that gives minimum distance with Euler scheme for UX2:\t\t', index_min_2e + min)
print( 'Window used that gives minimum distance with Milstein scheme for UX1:\t\t', index_min_1m + min)
print( 'Window used that gives minimum distance with Milstein scheme for UX2:\t\t', index_min_2m + min)
print( '\n\n' )
loss_1e = sum( ( vol1_mean_e[ index_min_1e, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )
loss_1m = sum( ( vol1_mean_m[ index_min_1m, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )
loss_2e = sum( ( vol2_mean_e[ index_min_2e, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) )
loss_2m = sum( ( vol2_mean_m[ index_min_2m, i ] - UX2.PX_LAST.values[ i ] )**2 for i in range( max, UX2.index.size ) )
print( 'Minimum loss for Euler scheme, UX1:\t\t', "{:.2f}".format( loss_1e ) )
print( 'Minimum loss for Euler scheme, UX2:\t\t', "{:.2f}".format( loss_2e ) )
print( 'Minimum loss for Milstein scheme, UX1:\t\t', "{:.2f}".format( loss_1m ) )
print( 'Minimum loss for Milstein scheme, UX2:\t\t', "{:.2f}".format( loss_2m ) )
UX1[ 'mean_sim_e' ] = vol1_mean_e[ index_min_1e, : ]
UX2[ 'mean_sim_e' ] = vol2_mean_e[ index_min_2e, : ]
SPX_price[ 'mean_sim1_e' ] = price1_mean_e[ index_min_1e, : ]
SPX_price[ 'mean_sim2_e' ] = price2_mean_e[ index_min_2e, : ]
UX1[ 'mean_sim_m' ] = vol1_mean_m[ index_min_1m, : ]
UX2[ 'mean_sim_m' ] = vol2_mean_m[ index_min_2m, : ]
SPX_price[ 'mean_sim1_m' ] = price1_mean_m[ index_min_1m, : ]
SPX_price[ 'mean_sim2_m' ] = price2_mean_m[ index_min_2m, : ]
UX1[ 'stoc_m' ] = volat1_m[ index_min_1m, :, 0 ]
UX2[ 'stoc_m' ] = volat2_m[ index_min_2m, :, 0 ]
Window used that gives minimum distance with Euler scheme for UX1:		 6
Window used that gives minimum distance with Euler scheme for UX2:		 12
Window used that gives minimum distance with Milstein scheme for UX1:		 6
Window used that gives minimum distance with Milstein scheme for UX2:		 12
Minimum loss for Euler scheme, UX1:		 9970.06
Minimum loss for Euler scheme, UX2:		 7396.89
Minimum loss for Milstein scheme, UX1:		 9970.44
Minimum loss for Milstein scheme, UX2:		 7398.60
We see that for our task there is no evident difference between the two methods: the Milstein scheme is overkill here, because for each point we only compute the next one, with a certain level of stochasticity, so the extra accuracy of the scheme is not appreciable.
To find where the problem lies we plot the best parameters found: we see that the parameter $\theta$ is systematically below the squared VIX (the variance), meaning that our model constantly tries to decrease the value of the VIX.
n = index_min_1m + min  # argmin returns an index into the window array; add min to recover the window size
k = np.zeros( shape = ( UX1.index.size - n ) )
sigma = np.zeros( shape = ( UX1.index.size - n ) )
theta = np.zeros( shape = ( UX1.index.size - n ) )
for i in range(n,UX1.index.size):
Vt1 = UX1.PX_LAST.values[i-n:i]**2
b1 = np.abs( ( ( 1 / n**2 * sum( Vt1[i ] for i in range( 1, n ) ) * sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) -
( 1 / n * sum( Vt1[i ] * Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) ) /
( 1 / n**2 * sum( Vt1[i-1] for i in range( 1, n ) ) * sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - 1 ) )
b2 = np.abs( ( 1 / n * sum( Vt1[i ] * Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - b1 ) /
( sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) * ( 1 - b1 ) / n ) )
b3 = np.abs( 1 / n * sum( ( Vt1[i] - Vt1[i-1] * b1 - b2 * ( 1 - b1 )**2 ) / Vt1[ i - 1 ] for i in range( 1, n ) ) )
k[i-n] = - np.log( b1 ) / delta
theta[i-n] = b2
sigma[i-n] = np.sqrt( 2 * k1 * b3 / ( 1 - b1**2 ) )
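The per-window moment expressions above can be wrapped in a reusable helper. This is a sketch built from the same b1/b2/b3 formulas; the name `estimate_cir_params` is ours, not from the notebook.

```python
import numpy as np

def estimate_cir_params(Vt, delta):
    # Vectorized version of the b1/b2/b3 moment estimators used above:
    # k is the mean-reversion rate, theta the long-run variance level,
    # sigma the volatility of volatility.
    n = len(Vt)
    ratio = Vt[1:] / Vt[:-1]                       # V_i / V_{i-1}
    inv_prev = 1.0 / Vt[:-1]                       # 1 / V_{i-1}
    b1 = np.abs((Vt[1:].sum() * inv_prev.sum() / n**2 - ratio.sum() / n)
                / (Vt[:-1].sum() * inv_prev.sum() / n**2 - 1))
    b2 = np.abs((ratio.sum() / n - b1) / (inv_prev.sum() * (1 - b1) / n))
    b3 = np.abs(np.sum((Vt[1:] - Vt[:-1] * b1 - b2 * (1 - b1)**2) * inv_prev) / n)
    k = -np.log(b1) / delta
    theta = b2
    sigma = np.sqrt(2 * k * b3 / (1 - b1**2))
    return k, theta, sigma
```

Using one helper for UX1 and UX2 would also remove the duplicated estimator code in the loops below.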
fig, ax = plt.subplots( ncols=3, figsize=( 24, 10 ), sharex=True )
ax[0].plot( UX1.index.values[n:], theta, label = 'Parameter Theta' )
ax[0].plot( UX1.index.values[n:], UX1.PX_LAST.values[n:]**2, label = 'Variance' )
ax[0].set_title('Long-term variance')
ax[0].set_ylabel('Long-term variance')
ax[0].legend(loc='upper center', ncol=2)
ax[0].set_xlim([date(2018, 1, 1), date(2019, 1, 1)])
ax[0].set_ylim(bottom = 0, top = 1300)
ax[0].grid(ls = 'dotted')
ax[1].plot( UX1.index.values[n:], k )
ax[1].set_title('Reversion rate')
ax[1].set_ylabel('Reversion rate')
ax[1].set_ylim(bottom = 50,top = 350)
ax[1].grid(ls = 'dotted')
ax[2].plot( UX1.index.values[n:], sigma )
ax[2].set_title('Volatility of volatility')
ax[2].set_ylabel('Volatility of volatility')
ax[2].set_ylim(top = 50, bottom=10)
ax[2].grid(ls = 'dotted')
ax[0].grid(ls = 'dotted')
plt.show()
We now look at the predicted values for prices and volatility. The prices are clearly very noisy, probably due to the suboptimal estimation of the parameters.
On the other hand, even though the volatilities do not look particularly noisy, a look at the residuals shows that the estimation is not always optimal.
fig, ax = plt.subplots( nrows=2, figsize=( 15, 14 ), sharex=True )
ax[0].plot( SPX_price.index, SPX_price['mean_sim1_m'], label='Average of simulations for first future' )
ax[0].plot( SPX_price.index, SPX_price.Close, label='Real price')
ax[0].grid( linestyle='dotted' )
ax[0].legend(loc='upper left')
ax[0].set_title ( 'S&P500 price simulations for the first future' )
ax[0].set_ylabel( 'SPX Price' )
# An inner plot to zoom
axes1 = plt.axes( [0.175, 0.65, 0.2, 0.175] )
axes1.set_title( 'Zoom' )
axes1.plot( SPX_price.index, SPX_price['mean_sim1_m'] )
axes1.plot( SPX_price.index, SPX_price.Close )
axes1.grid( linestyle = 'dotted' )
axes1.set_xlim([date(2018, 1, 1), date(2019, 1, 1)])
axes1.set_ylim(bottom = 2300, top = 3000)
axes1.tick_params(axis='x', labelrotation=45)
ax[1].plot( SPX_price.index, SPX_price['mean_sim2_m'], label='Average of simulations for second future' )
ax[1].plot( SPX_price.index, SPX_price.Close, label='Real price')
ax[1].grid( linestyle='dotted' )
ax[1].legend(loc='upper left')
ax[1].set_title ( 'S&P500 price simulations for the second future' )
ax[1].set_ylabel( 'SPX Price' )
# An inner plot to zoom
axes1 = plt.axes( [0.165, 0.25, 0.2, 0.175] )
axes1.set_title( 'Zoom' )
axes1.plot( SPX_price.index, SPX_price['mean_sim2_m'] )
axes1.plot( SPX_price.index, SPX_price.Close )
axes1.grid( linestyle = 'dotted' )
axes1.set_xlim([date(2018, 1, 1), date(2019, 1, 1)])
axes1.set_ylim(bottom = 2300, top = 3000)
axes1.tick_params(axis='x', labelrotation=45)
fig.autofmt_xdate( )
plt.show()
fig, ax = plt.subplots( nrows=2, figsize=( 15, 10 ), sharex=True )
#ax[0].plot(UX1.index, UX1.PX_LAST, lw=2, label='Real volatility')
for k in range( Npaths ):
    ax[0].plot( UX1.index, volat1_m[ index_min_1m, :, k ], label='', lw=1 )
ax[0].grid( linestyle='dotted' )
#ax[0].legend(loc='upper left')
ax[0].set_title ( 'UX1, ' + str( Npaths ) + ' simulations' )
ax[0].set_ylabel( 'Implied volatility UX1' )
ax[1].plot( UX1.index, UX1['mean_sim_m'], label='Average of simulations' )
ax[1].plot( UX1.index, UX1.PX_LAST, label='Real volatility')
ax[1].grid( linestyle='dotted')
ax[1].legend( loc='upper left')
ax[1].set_title ( 'UX1, real data vs average of simulations')
ax[1].set_ylabel( 'Implied volatility UX1')
fig.autofmt_xdate( )
plt.show()
UX1['mean_sim_m'].to_csv('UX1_Heston_Predictions.csv')
fig, ax = plt.subplots(figsize=( 15, 10 ) )
ax.plot( UX1.index, UX1['mean_sim_m'], label='Average of simulations', lw = 3 )
ax.plot( UX1.index, UX1['stoc_m'], label='Random solution', ls = 'dashed' )
ax.plot( UX1.index, UX1.PX_LAST, label='Real volatility', lw = 4)
ax.plot( UX1.index, UX1.PX_LAST.shift(1), label='Shift of real volatility', lw = 4)
ax.grid( linestyle='dotted')
ax.legend( loc='upper left')
ax.vlines( date(2017, 7, 6), ymin=0, ymax=20, ls='dashed' )
ax.vlines( date(2017, 7, 7), ymin=0, ymax=20, ls='dashed' )
ax.set_title ( 'UX1, real data vs average of simulations')
ax.set_ylabel( 'Implied volatility UX1')
ax.set_xlim([date(2017, 6, 1), date(2017, 9, 1)])
ax.set_ylim([9,16])
fig.autofmt_xdate( )
plt.show()
fig, ax = plt.subplots(nrows=2, figsize=(15,10), )
for k in range(Npaths):
    ax[0].plot( UX2.index, volat2_m[ index_min_2m, :, k ], label='', lw=1 )
#ax[0].plot(UX2.index, UX2.PX_LAST, lw=5, label='Real volatility')
ax[0].grid(linestyle='dotted')
#ax[0].legend(loc='best')
ax[0].set_title('UX2, ' + str(Npaths) + ' simulations')
ax[0].set_ylabel('Implied volatility UX2')
ax[1].plot(UX2.index, UX2['mean_sim_m'], label='Average of simulations')
ax[1].plot(UX2.index, UX2.PX_LAST, label='Real volatility')
ax[1].grid(linestyle='dotted')
ax[1].legend(loc='best')
ax[1].set_title('UX2, real data vs average of simulations')
ax[1].set_ylabel('Implied volatility UX2')
fig.autofmt_xdate()
plt.show()
fig, ax = plt.subplots(figsize=( 15, 10 ) )
ax.plot( UX2.index, UX2['mean_sim_m'], label='Average of simulations', lw = 3 )
ax.plot( UX2.index, UX2['stoc_m'], label='Random solution', linestyle = 'dashed' )
ax.plot( UX2.index, UX2.PX_LAST, label='Real volatility', lw = 4)
ax.plot( UX2.index, UX2.PX_LAST.shift(1), label='Shift of real volatility', lw = 4)
ax.vlines( date(2014, 8, 7), ymin=0, ymax=20, ls='dashed' )
ax.vlines( date(2014, 8, 8), ymin=0, ymax=20, ls='dashed' )
ax.grid( linestyle='dotted')
ax.legend( loc='upper left')
ax.set_title ( 'UX2, real data vs average of simulations')
ax.set_ylabel( 'Implied volatility UX2')
ax.set_xlim([date(2014, 6, 1), date(2014, 9, 1)])
ax.set_ylim([12.5,17])
fig.autofmt_xdate( )
plt.show()
We take a look at the residuals between the expected values at $t+1$ and the real ones; we expect a Gaussian behaviour centred at $0$.
UX1[ 'resid' ] = UX1['mean_sim_m'] - UX1['PX_LAST']
UX2[ 'resid' ] = UX2['mean_sim_m'] - UX2['PX_LAST']
result = stats.linregress(np.arange(6,len(UX1.resid)),
UX1.resid.values[6:])
fit_rs = 'y = {:.3f} + {:.3f} x'.format(result.intercept, result.slope)
print(fit_rs)
min, max, step = 6, len(UX1.resid), 30
x = np.arange(min, max, step)
y = np.empty(0)
erry = np.empty(0)
for i in x:
    res_x = UX1.resid.values[ i: i + step ]
    y = np.append( y, res_x.mean() )
    erry = np.append( erry, res_x.std( ) )
sns.color_palette("muted")
g = sns.jointplot(x = np.arange(6,len(UX1.resid)),
y = UX1.resid.values[6:],
kind = 'reg',
line_kws = {'color':'orange', "label" : fit_rs, "lw" : 3})
g.ax_joint.plot(np.arange(6,len(UX1.resid)),
result.intercept + result.slope*UX1.resid.values[6:],
linewidth = 5, label = "Linregress", color = 'r',
alpha = 0.5 )
g.ax_joint.plot(np.arange(6,len(UX1.resid)), np.zeros(shape=(len(UX1.resid)-6)),
ls = 'dashed',
linewidth = 5, label = "Expected residual", color = 'orange',
alpha = 1 )
g.ax_joint.errorbar(x, y, yerr = erry, fmt = 'ro', linestyle = 'none', label = "Error bars")
g.ax_marg_x.remove()
g.set_axis_labels('#day', 'Residual', fontsize=14)
g.ax_joint.legend(loc = "upper right", fontsize = 15)
g.fig.set_figwidth(15)
g.fig.set_figheight(10)
y = -1.281 + 0.000 x
result = stats.linregress(np.arange(12,len(UX2.resid)),
UX2.resid.values[12:])
fit_rs = 'y = {:.3f} + {:.3f} x'.format(result.intercept, result.slope)
print(fit_rs)
min, max, step = 12, len(UX2.resid), 30
x = np.arange(min, max, step)
y = np.empty(0)
erry = np.empty(0)
for i in x:
    res_x = UX2.resid.values[ i: i + step ]
    y = np.append( y, res_x.mean() )
    erry = np.append( erry, res_x.std( ) )
sns.color_palette("muted")
g = sns.jointplot(x = np.arange(12,len(UX2.resid)),
y = UX2.resid.values[12:],
kind = 'reg',
line_kws = {'color':'orange', "label" : fit_rs, "lw" : 3})
g.ax_joint.plot(np.arange(12,len(UX2.resid)),
result.intercept + result.slope*UX2.resid.values[12:],
linewidth = 5, label = "Linregress", color = 'r',
alpha = 0.5 )
g.ax_joint.plot(np.arange(12,len(UX2.resid)), np.zeros(shape=(len(UX2.resid)-12)),
ls = 'dashed',
linewidth = 5, label = "Expected residual", color = 'orange',
alpha = 1 )
g.ax_joint.errorbar(x, y, yerr = erry, fmt = 'ro', linestyle = 'none', label = "Error bars")
g.ax_marg_x.remove()
g.set_axis_labels('#day', 'Residual', fontsize=14)
g.ax_joint.legend(loc = "upper right", fontsize = 15)
g.fig.set_figwidth(15)
g.fig.set_figheight(10)
y = -0.702 + 0.000 x
b = np.histogram_bin_edges(UX1.resid.dropna(), bins='fd')
fig, ax = plt.subplots(figsize=(15, 10))
entries, edges, _ = ax.hist(UX1.resid, bins=b)
ax.set_ylabel('Counts')
ax.set_xlabel("Normally distributed values")
ax.yaxis.set_major_locator(MultipleLocator(20))
ax.xaxis.set_major_locator(MultipleLocator( 2))
ax.tick_params(which='major', width=1.0)
ax.tick_params(which='major', length=10)
ax.grid(linestyle='dotted', alpha=0.5)
# calculate bin centers
bin_centers = 0.5 * (edges[:-1] + edges[1:])
# draw error bars using the sqrt of the counts;
# Poissonian 1-sigma intervals would be more rigorous
ax.errorbar(bin_centers, entries, yerr=np.sqrt(entries), fmt='o')
def gaussian(x, N, mu, sig):
    return N * np.exp(-np.power(x - mu, 2.) / (2 * np.power(sig, 2.)))
popt, pcov = optimize.curve_fit(gaussian, bin_centers, entries, p0 = [100,0, 1])
ax.plot(bin_centers, gaussian(bin_centers, popt[0],popt[1],popt[2]), linestyle='dashed',c = 'r', lw=3 )
ssr = np.sum((entries - gaussian(bin_centers, popt[0],popt[1],popt[2]))**2)
tss = np.sum((np.mean(entries) - entries)**2)
rsq = 1 - ssr / tss
sigma_y = np.sqrt(np.abs(entries))
mask_err = (sigma_y!=0)
offsetUX1 = popt[1]
UX1.mean_sim_m -= offsetUX1
print("N \t\t\t=\t\t", popt[0], "\t\t+/-\t\t", pcov[0,0]**.5)
print("mu \t\t\t=\t\t", popt[1], "\t\t+/-\t\t", pcov[1,1]**.5)
print("std\t\t\t=\t\t", popt[2], "\t\t+/-\t\t", pcov[2,2]**.5)
print("\n")
print("R2 \t\t\t=\t\t", rsq, "\nR \t\t\t=\t\t", np.sqrt(rsq))
print("\nCovariance matrix:\n", np.corrcoef(bin_centers,entries)) # check with the correlation matrix that R is the correlation coefficient
ndof = len(bin_centers[mask_err]) - 3
print("\nn degrees of freedom \t=\t\t", ndof)
# calculate the chi^2
chi2 = np.sum(((entries[mask_err] - (gaussian(bin_centers[mask_err], popt[0],popt[1],popt[2])))**2) / sigma_y[mask_err]**2)
print("\nchi2 \t\t\t=\t\t", chi2)
# calculate the p-value from the chi^2, the n.d.o.f., and the cumulative chi^2 distribution
pvalue = 1. - stats.chi2.cdf(chi2, ndof)
print("p-value \t\t=\t\t", pvalue) # if the p-value is < 0.05, the fit is considered unsatisfactory
N   = 240.61628045476658  +/-  4.240415688731638
mu  = -0.9621835175000198  +/-  0.012576207514057768
std = 0.6179923109380179  +/-  0.012576207382752367
R2  = 0.9528929795659218
R   = 0.9761623735659564
Covariance matrix:
[[1.         0.07020391]
 [0.07020391 1.        ]]
n degrees of freedom = 82
chi2 = 453.662025130365
p-value = 0.0
b = np.histogram_bin_edges(UX2.resid.dropna(), bins='fd')
fig, ax = plt.subplots(figsize=(15, 10))
entries, edges, _ = ax.hist(UX2.resid, bins=b)
ax.set_ylabel('Counts')
ax.set_xlabel("Normally distributed values")
ax.yaxis.set_major_locator(MultipleLocator(20))
ax.xaxis.set_major_locator(MultipleLocator( 2))
ax.tick_params(which='major', width=1.0)
ax.tick_params(which='major', length=10)
ax.grid(linestyle='dotted', alpha=0.5)
# calculate bin centers
bin_centers = 0.5 * (edges[:-1] + edges[1:])
# draw error bars using the sqrt of the counts;
# Poissonian 1-sigma intervals would be more rigorous
ax.errorbar(bin_centers, entries, yerr=np.sqrt(entries), fmt='o')
def gaussian(x, N, mu, sig):
    return N * np.exp(-np.power(x - mu, 2.) / (2 * np.power(sig, 2.)))
popt, pcov = optimize.curve_fit(gaussian, bin_centers, entries, p0 = [100,0, 1])
ax.plot(bin_centers, gaussian(bin_centers, popt[0],popt[1],popt[2]), linestyle='dashed',c = 'r', lw=3 )
ssr = np.sum((entries - gaussian(bin_centers, popt[0],popt[1],popt[2]))**2)
tss = np.sum((np.mean(entries) - entries)**2)
rsq = 1 - ssr / tss
sigma_y = np.sqrt(np.abs(entries))
mask_err = (sigma_y!=0)
offsetUX2 = popt[1]
UX2.mean_sim_m -= offsetUX2
print("N \t\t\t=\t\t", popt[0], "\t\t+/-\t\t", pcov[0,0]**.5)
print("mu \t\t\t=\t\t", popt[1], "\t\t+/-\t\t", pcov[1,1]**.5)
print("std\t\t\t=\t\t", popt[2], "\t\t+/-\t\t", pcov[2,2]**.5)
print("\n")
print("R2 \t\t\t=\t\t", rsq, "\nR \t\t\t=\t\t", np.sqrt(rsq))
print("\nCovariance matrix:\n", np.corrcoef(bin_centers,entries)) # check with the correlation matrix that R is the correlation coefficient
ndof = len(bin_centers[mask_err]) - 3
print("\nn degrees of freedom \t=\t\t", ndof)
# calculate the chi^2
chi2 = np.sum(((entries[mask_err] - (gaussian(bin_centers[mask_err], popt[0],popt[1],popt[2])))**2) / sigma_y[mask_err]**2)
print("\nchi2 \t\t\t=\t\t", chi2)
# calculate the p-value from the chi^2, the n.d.o.f., and the cumulative chi^2 distribution
pvalue = 1. - stats.chi2.cdf(chi2, ndof)
print("p-value \t\t=\t\t", pvalue) # if the p-value is < 0.05, the fit is considered unsatisfactory
N   = 225.29416816752598  +/-  2.369863015571822
mu  = -0.40851958877480377  +/-  0.007797801337236014
std = 0.6419801645133866  +/-  0.007797801233116348
R2  = 0.9756825587535507
R   = 0.9877664494978308
Covariance matrix:
[[ 1.         -0.06234497]
 [-0.06234497  1.        ]]
n degrees of freedom = 92
chi2 = 326.88135539360684
p-value = 0.0
We see a systematic underestimation of the volatility: we use the fitted mean as a general offset for the expected volatilities.
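The bias correction amounts to subtracting the fitted Gaussian mean from the predictions. A minimal sketch with synthetic residuals standing in for `UX1['resid']` (the values -0.96 and 0.62 mimic the fitted mu and std above):

```python
import numpy as np

# Synthetic residuals mimicking the fitted Gaussian above; subtracting the
# estimated mean centres them at zero, which is what the notebook does with
# `UX1.mean_sim_m -= offsetUX1` using the fitted mu.
rng = np.random.default_rng(42)
resid = rng.normal(loc=-0.96, scale=0.62, size=2000)
offset = resid.mean()          # here the sample mean stands in for the fitted mu
corrected = resid - offset     # centred residuals
```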
We also try to estimate the intraday values of the volatility. For simplicity we divide each open-market day into 12 parts, assuming 9 hours of trading (in reality markets are open for about $6$–$8.5$ hours).
steps = 12
min = 3
max = 30
delta = 1 / 252 / steps
volat1 = np.zeros( shape = ( max - min, UX1.index.size * steps, Npaths ) )
volat2 = np.zeros( shape = ( max - min, UX2.index.size * steps, Npaths ) )
vol1_mean = np.zeros( shape = ( max - min, UX1.index.size * steps ) )
vol2_mean = np.zeros( shape = ( max - min, UX2.index.size * steps ) )
for n in window:
    print( '... processing window', n, '...' )
    prices1 = np.empty( ( ( SPX_price.index.size - n ) * steps, Npaths ) )
    prices2 = np.empty( ( ( SPX_price.index.size - n ) * steps, Npaths ) )
    vol1 = np.empty( ( ( UX1.index.size - n ) * steps, Npaths ) )
    vol2 = np.empty( ( ( UX2.index.size - n ) * steps, Npaths ) )
    for i in range( n, UX1.index.size ):
        Vt1 = UX1.PX_LAST.values[ i-n:i ]**2
        Vt2 = UX2.PX_LAST.values[ i-n:i ]**2
        S = SPX_price.Close.values[ i-n:i ]
        V10 = Vt1[-1]
        V20 = Vt2[-1]
        S0 = S[-1]
        b1 = np.abs( ( ( 1 / n**2 * sum( Vt1[ i ] for i in range( 1, n ) ) * sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) -
                       ( 1 / n * sum( Vt1[ i ] * Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) ) /
                     ( 1 / n**2 * sum( Vt1[ i - 1 ] for i in range( 1, n ) ) * sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - 1 ) )
        b2 = np.abs( ( 1 / n * sum( Vt1[ i ] * Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - b1 ) /
                     ( sum( Vt1[ i - 1 ]**( -1 ) for i in range( 1, n ) ) * ( 1 - b1 ) / n ) )
        b3 = np.abs( 1 / n * sum( ( Vt1[i] - Vt1[i-1] * b1 - b2 * ( 1 - b1 )**2 ) / Vt1[ i - 1 ] for i in range( 1, n ) ) )
        k1 = - np.log( b1 ) / delta
        theta1 = b2
        sigma1 = np.sqrt( 2 * k1 * b3 / ( 1 - b1**2 ) )
        b1 = np.abs( ( ( 1 / n**2 * sum( Vt2[ i ] for i in range( 1, n ) ) * sum( Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) -
                       ( 1 / n * sum( Vt2[ i ] * Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) ) ) /
                     ( 1 / n**2 * sum( Vt2[ i - 1 ] for i in range( 1, n ) ) * sum( Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - 1 ) )
        b2 = np.abs( ( 1 / n * sum( Vt2[ i ] * Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) - b1 ) /
                     ( sum( Vt2[ i - 1 ]**( -1 ) for i in range( 1, n ) ) * ( 1 - b1 ) / n ) )
        b3 = np.abs( 1 / n * sum( ( Vt2[i] - Vt2[i-1] * b1 - b2 * ( 1 - b1 )**2 ) / Vt2[ i - 1 ] for i in range( 1, n ) ) )
        k2 = - np.log( b1 ) / delta
        theta2 = b2
        sigma2 = np.sqrt( 2 * k2 * b3 / ( 1 - b1**2 ) )
        _, v1 = heston_pde_milstein( S0, r, k1, theta1, V10, rho, sigma1, steps, Npaths )
        _, v2 = heston_pde_milstein( S0, r, k2, theta2, V20, rho, sigma2, steps, Npaths )
        vol1[ i*steps-n*steps:i*steps-n*steps+steps, : ] = v1[:].T
        vol2[ i*steps-n*steps:i*steps-n*steps+steps, : ] = v2[:].T
    vol1 = np.insert( vol1, 0, np.full( ( n * steps, Npaths ), np.nan ), axis = 0 )
    vol2 = np.insert( vol2, 0, np.full( ( n * steps, Npaths ), np.nan ), axis = 0 )
    volat1[ n - min, :, : ] = np.sqrt( vol1 )
    volat2[ n - min, :, : ] = np.sqrt( vol2 )
    vol1_mean[ n - min, : ] = np.sqrt( vol1.mean( axis = 1 ) )
    vol2_mean[ n - min, : ] = np.sqrt( vol2.mean( axis = 1 ) )
... processing window 3 ...
... processing window 4 ...
... processing window 5 ...
... processing window 6 ...
... processing window 7 ...
... processing window 8 ...
... processing window 9 ...
... processing window 10 ...
... processing window 11 ...
... processing window 12 ...
... processing window 13 ...
... processing window 14 ...
... processing window 15 ...
... processing window 16 ...
... processing window 17 ...
... processing window 18 ...
... processing window 19 ...
... processing window 20 ...
... processing window 21 ...
... processing window 22 ...
... processing window 23 ...
... processing window 24 ...
... processing window 25 ...
... processing window 26 ...
... processing window 27 ...
... processing window 28 ...
... processing window 29 ...
fig, ax = plt.subplots( figsize=( 15, 7 ) )
ax.plot( np.arange( min, max ), sum( ( vol1_mean[ :, i * steps ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) ), label='Loss with intradays evaluation' )
ax.plot( np.arange( min, max ), sum( ( vol1_mean_m[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )[0:max-min], label='Loss day by day' )
ax.set_xlabel('# Window used')
ax.set_ylabel('Total residual')
ax.grid(linestyle='dotted')
ax.set_title('Residual as function of points used')
ax.set_title( 'Loss as function of #days used for calibration' )
ax.legend(loc = 'lower right')
# An inner plot to zoom
axes = plt.axes( [ 0.275, 0.425, 0.3, 0.4 ] )
axes.set_title( 'Zoom' )
axes.plot( np.arange( 5, 9 ), sum( ( vol1_mean[ :, i * steps ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )[5-min:9-min] )
axes.plot( np.arange( 5, 9 ), sum( ( vol1_mean_m[ :, i ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )[5-min:9-min] )
axes.grid( linestyle = 'dotted' )
axes.set_ylim( [9750,12000] )
axes.set_xlim( [5,8] )
axes.xaxis.set_major_locator(MultipleLocator(1))
plt.show()
loss_intr = sum( ( vol1_mean[ 7, i * steps ] - UX1.PX_LAST.values[ i ] )**2 for i in range( max, UX1.index.size ) )
print( 'Minimum loss for day by day scheme, UX1:\t\t', "{:.2f}".format( loss_1m ) )
print( 'Minimum loss for intradays scheme, UX1:\t\t', "{:.2f}".format( loss_intr ) )
Minimum loss for day by day scheme, UX1: 9970.44
Minimum loss for intradays scheme, UX1:  10577.97
The results are essentially the same. To properly validate the intraday predictions we would also need intraday real data.
index = UX1.index
new_index = []
for i in index:
    new_index.append(i)
    for j in range(11):
        minutes = 45 * ( j + 1 )  # 12 intraday steps per day; the original constant 45 produced duplicate timestamps
        time_added = datetime.timedelta( minutes = minutes )
        new_index.append( i + time_added )
del max  # restore the builtin `max`, shadowed above
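Equivalently, the intraday index can be built with pandas utilities. A sketch assuming the same 45-minute spacing (`intraday_index` is a helper name we introduce):

```python
import pandas as pd

def intraday_index(daily_index, parts=12, step_minutes=45):
    # Expand each daily timestamp into `parts` sub-steps spaced `step_minutes` apart
    out = []
    for t in daily_index:
        out.extend(t + pd.Timedelta(minutes=step_minutes * j) for j in range(parts))
    return pd.DatetimeIndex(out)
```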
UX1_s = pd.DataFrame( vol1_mean[index_min_1m,:], index=new_index, columns=['Value'] )
UX1_s.Value -= offsetUX1
UX1_s['stoc'] = volat1[index_min_1m,:,0].ravel() - offsetUX1
fig, ax = plt.subplots( figsize = ( 15, 10 ) )
UX1.mean_sim_m.plot(ax = ax, lw=5, label='Mean value day by day simulation')
UX1_s.stoc.plot(ax = ax, lw= 2, ls='dashdot', label = 'Random simulation with intradays predictions' )
UX1.PX_LAST.plot(ax = ax, ls = 'dotted', label='Real price')
UX1_s.Value.plot(ax = ax, ls = 'dashed', lw=3, label = 'Mean value of simulations with intradays predictions')
ax.set_xlim(['2021-01-01','2021-06-01'])
ax.set_ylim([18,33])
ax.grid(ls='dotted')
ax.legend()
ax.set_ylabel('Volatility')
ax.set_xlabel('')
ax.set_title('Volatility - Predictions vs Real')
plt.show()
The first strategy consists in opening a short position if the predicted VIX for $t+1$ is lower than today's value, and not closing it until the predicted VIX for $t+1$ increases. This strategy carries a great risk.
The second strategy consists in opening a short position each time the predicted VIX for $t+1$ is lower than today's value, and closing it the next day. This strategy should be more conservative.
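Both rules reduce to a simple state machine on the sign of $VIX[t] - VIX_{pred}[t+1]$. A compact sketch of the first strategy's signal generation (our own helper, mirroring the `np.select` logic used in this notebook):

```python
import numpy as np
import pandas as pd

def short_signals(vix_today, vix_pred_next):
    # 'open' while tomorrow's predicted VIX is below today's, 'close' otherwise;
    # transitions between the two states are flagged as 'to_open' / 'to_close'
    diff = vix_today - vix_pred_next
    sig = pd.Series(np.where(diff > 0, 'open', 'close'), index=vix_today.index)
    prev = sig.shift(1)
    sig[(sig == 'open') & (prev == 'close')] = 'to_open'
    sig[(sig == 'close') & (prev == 'open')] = 'to_close'
    return sig
```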
UX1['diff_sim'] = UX1['PX_LAST'] - UX1['mean_sim_m'].shift(-1)
conditions = [ ( UX1['diff_sim'] > 0 ),
( UX1['diff_sim'] <= 0 ) ]
choices = ['open','close']
UX1['open/close'] = np.select( conditions, choices, default = 0 )
conditions = [ ( UX1['open/close'] == 'open' ) & ( UX1['open/close'].shift(1) == 'close' ),
( UX1['open/close'] == 'close' ) & ( UX1['open/close'].shift(1) == 'open' ) ]
choices = ['to_open','to_close']
UX1['open/close'] = np.select( conditions, choices, default = UX1['open/close'] )
UX1['diff_sim_real'] = UX1['PX_LAST'] - UX1['PX_LAST'].shift(-1)
conditions = [ ( UX1['diff_sim_real'] > 0 ),
( UX1['diff_sim_real'] <= 0 ) ]
choices = ['open','close']
UX1['open/close_real'] = np.select( conditions, choices, default = 0 )
conditions = [ ( ( UX1['open/close_real'] == 'open' ) & ( ( UX1['open/close_real'].shift(1) == 'close' ) | ( UX1['open/close_real'].shift(1) == 0 ) ) ),
( ( UX1['open/close_real'] == 'close' ) & ( ( UX1['open/close_real'].shift(1) == 'open' ) | ( UX1['open/close_real'].shift(1) == 0 ) ) ) ]
choices = ['to_open','to_close']
UX1['open/close_real'] = np.select( conditions, choices, default = UX1['open/close_real'] )
UX1['open/close_diff'] = 'correspond'
UX1.loc[ UX1['open/close'] != UX1['open/close_real'], 'open/close_diff' ] = 'not correspond'  # .loc on the frame avoids chained assignment
print( 'Total values processed:\t\t\t\t\t', len(UX1))
print( 'Values predicted that correspond to the real ones:\t', len(UX1[UX1['open/close'] == UX1['open/close_real'].shift(-1)]))
Total values processed: 2964
Values predicted that correspond to the real ones: 874
fig, ax = plt.subplots(figsize = ( 8, 7 ))
y = UX1['open/close_diff'].value_counts()
labels = y.index.tolist()  # take label order from value_counts to avoid mislabelled bars
xticks = np.arange( 1, len( y ) + 1 )
ax.bar(xticks,y, width = 0.5, align="center")
ax.set_title('Agreement between real values and predicted ones', fontsize = 16)
ax.set_xticks(xticks)
ax.set_xticklabels(labels)
fig.tight_layout()
fig, ax = plt.subplots(figsize = ( 8, 7 ))
y = UX1['open/close'].value_counts()
labels = y.index.tolist()  # take label order from value_counts to avoid mislabelled bars
xticks = np.arange( 1, len( y ) + 1 )
ax.bar( xticks, y, width = 0.5, align = "center" )
ax.set_title( 'Distribution of the predicted open/close signals', fontsize = 16 )
ax.set_xticks(xticks)
ax.set_xticklabels(labels)
fig.tight_layout()
money = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money[0] = 1000
i = 0
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['open/close'].iloc[x] == 'to_open':
        money[x] = money[x-1]
        i += 1
    elif UX1['open/close'].iloc[x] == 'open':
        money[x] = money[x-1]
        i += 1
    elif UX1['open/close'].iloc[x] == 'to_close':
        money[x] = money[x-1] + 100 * ( ( UX1['PX_LAST'].iloc[x-i] - UX1['PX_LAST'].iloc[x] ) / UX1['PX_LAST'].iloc[x-i] )
        i = 0
    else:
        money[x] = money[x-1]
money_real = np.zeros(shape=(len(UX1['PX_LAST'])))
money_real[0] = 1000
i = 0
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['open/close_real'].iloc[x] == 'to_open':
        money_real[x] = money_real[x-1]
        i += 1
    elif UX1['open/close_real'].iloc[x] == 'open':
        money_real[x] = money_real[x-1]
        i += 1
    elif UX1['open/close_real'].iloc[x] == 'to_close':
        money_real[x] = money_real[x-1] + 100 * ( ( UX1['PX_LAST'].iloc[x-i] - UX1['PX_LAST'].iloc[x] ) / UX1['PX_LAST'].iloc[x-i] )
        i = 0
    else:
        money_real[x] = money_real[x-1]
UX1['Money_STR1a'] = money
fig, ax = plt.subplots( figsize = ( 12, 7 ) )
UX1.Money_STR1a.plot( ax = ax )
ax.set_title( 'Money gained with the first trading strategy' )
ax.set_xlabel( '' )
ax.set_ylabel( 'Money gained' )
ax.grid( linestyle = 'dotted' )
ax.annotate( "COVID crisis", xy = ( '2020-03-15', 1180 ), xytext = ( '2020-06-01', 1150 ),
arrowprops = dict( arrowstyle = "->" ) )
plt.show()
UX1['Money_Real_STR1a'] = money_real
fig, ax = plt.subplots( figsize = ( 12, 7 ) )
UX1.Money_Real_STR1a.plot( ax = ax )
ax.set_title( 'Money gained with the first trading strategy' )
ax.set_xlabel( '' )
ax.set_ylabel( 'Money gained' )
ax.grid( linestyle = 'dotted' )
#ax.annotate( "COVID crisis", xy = ( '2020-03-15', 975 ), xytext = ( '2020-06-01', 1050 ),
# arrowprops = dict( arrowstyle = "->" ) )
plt.show()
Instead of investing the same amount based only on the prediction
$$ VIX[t] - VIX[t+1] < 0 $$
we decide to multiply the constant arbitrary value by a coefficient that measures the confidence we have in our bet: we choose
$$ \gamma = VIX_{real}[t] - VIX_{pred}[t+1] $$
so that the gain of the bet becomes
$$ m[i+1] = m[i] + 100\gamma\frac{VIX_{real}[t]-VIX_{real}[t+1]}{VIX_{real}[t]} $$
Besides this, we also change the constant part: we could divide the money into brackets delimited by powers of 10 (e.g. with 2000 we are in the 1000 bracket), but to obtain something more efficient and less arbitrary we combine the factors as follows:
$$ m[i+1] = m[i] + \frac{m[i]}{d}\cdot \tanh\gamma \cdot \frac{VIX_{real}[t]-VIX_{real}[t+1]}{VIX_{real}[t]} $$
with $d$ larger when $VIX_{real}[t]<20$, to take the mean reversion of volatility into account.
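One step of this sizing rule can be written as a small function. This is a sketch (`sized_bet_update` is our own name); note that the notebook's code applies $\tanh$ to $\gamma/d$ rather than to $\gamma$ alone, and we follow the code here.

```python
import math

def sized_bet_update(m, gamma, vix_t, vix_t1, d):
    # Stake a fraction m/d, scaled by tanh of the (scaled) confidence gamma,
    # and apply it to the realised short-VIX return (vix_t - vix_t1) / vix_t.
    stake = m / d * math.tanh(gamma / d)
    return m + stake * (vix_t - vix_t1) / vix_t
```

The $\tanh$ bounds the stake even for very confident predictions, so a single bet can never exceed the fraction $m/d$ of the capital.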
money = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money[0] = 1000
inv_eff = np.array([])
i = 0
investment = []
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['PX_LAST'].iloc[x] > 20:
        d = 10
    else:
        d = 30
    if UX1['open/close'].iloc[x] == 'to_open': # open the position, put a fraction as investment
        new_investment = money[ x - 1 ] / d * np.tanh( UX1['diff_sim'].iloc[x] / d )
        money[x] = money[ x - 1 ]
        investment.append( new_investment )
        i += 1
    elif UX1['open/close'].iloc[x] == 'open': # maintain the position
        new_investment = money[ x - 1 ] / d * np.tanh( UX1['diff_sim'].iloc[x] / d )
        money[x] = money[ x - 1 ]
        investment.append( new_investment )
        i += 1
    elif UX1['open/close'].iloc[x] == 'to_close': # close the position
        inv_eff = np.append( inv_eff, sum( investment ) )
        gain = 0
        for j in range( len( investment ) ):
            gain += investment[j] * ( ( UX1['PX_LAST'].iloc[ x - ( len( investment ) - j ) ] - UX1['PX_LAST'].iloc[x] ) / UX1['PX_LAST'].iloc[ x - ( len( investment ) - j ) ] )
        money[x] = money[x-1] + gain
        i = 0
        investment = []
    else:
        money[x] = money[x-1]
    if money[x] < 1 and ( UX1['open/close'].iloc[x] == 'to_close' or
                          UX1['open/close'].iloc[x] == 'close' ):
        money[x:] = money[x]
        break
money_real = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money_real[0] = 1000
inv_eff = np.array( [] )
i = 0
investment = []
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['PX_LAST'].iloc[x] > 20:
        d = 10
    else:
        d = 30
    if UX1['open/close_real'].iloc[x] == 'to_open': # open the position, put a fraction as investment
        new_investment = money_real[ x - 1 ] / d * np.tanh( UX1['diff_sim_real'].iloc[x] / d )
        money_real[x] = money_real[ x - 1 ] - new_investment
        investment.append( new_investment )
        i += 1
    elif UX1['open/close_real'].iloc[x] == 'open': # maintain the position
        new_investment = money_real[ x - 1 ] / d * np.tanh( UX1['diff_sim_real'].iloc[x] / d )
        money_real[x] = money_real[ x - 1 ] - new_investment
        investment.append( new_investment )
        i += 1
    elif UX1['open/close_real'].iloc[x] == 'to_close': # close the position
        inv_eff = np.append( inv_eff, sum( investment ) )
        gain = 0
        for j in range( len( investment ) ):
            gain += investment[j] * ( 1 + ( UX1['PX_LAST'].iloc[ x - ( len( investment ) - j ) ] - UX1['PX_LAST'].iloc[x] ) / UX1['PX_LAST'].iloc[ x - ( len( investment ) - j ) ] )
        money_real[x] = money_real[x-1] + gain
        i = 0
        investment = []
    else:
        money_real[x] = money_real[x-1]
    if money_real[x] < 1:
        money_real[x:] = money_real[x]
        break
UX1['Money_STR1b'] = money
fig, ax = plt.subplots( figsize = ( 12, 7 ) )
UX1.Money_STR1b.plot( ax = ax )
ax.set_title( 'Money gained with the first trading strategy' )
ax.set_xlabel( '' )
ax.set_ylabel( 'Money gained' )
ax.grid( linestyle = 'dotted' )
ax.annotate( "COVID crisis", xy = ( '2020-03-01', 1200 ), xytext = ( '2020-12-01', 1250 ),
arrowprops = dict( arrowstyle = "->" ) )
plt.show()
conditions = [ ( UX1['diff_sim'] > 0 ),
( UX1['diff_sim'] <= 0 ) ]
choices = ['open','close']
UX1['open/close_2'] = np.select( conditions, choices, default = 0 )
conditions = [ ( UX1['diff_sim_real'] > 0 ),
( UX1['diff_sim_real'] <= 0 ) ]
choices = ['open','close']
UX1['open/close_real_2'] = np.select( conditions, choices, default = 0 )
money_2 = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money_2[0] = 1000
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['open/close_2'].iloc[x] == 'open':
        money_2[x] = money_2[x-1] + 100 * ( ( UX1['PX_LAST'].iloc[x] - UX1['PX_LAST'].iloc[x+1] ) / UX1['PX_LAST'].iloc[x] )
    else:
        money_2[x] = money_2[x-1]
money_real_2 = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money_real_2[0] = 1000
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['open/close_real_2'].iloc[x] == 'open':
        money_real_2[x] = money_real_2[x-1] + 100 * ( ( UX1['PX_LAST'].iloc[x] - UX1['PX_LAST'].iloc[x+1] ) / UX1['PX_LAST'].iloc[x] )
    else:
        money_real_2[x] = money_real_2[x-1]
UX1['Money_STR2a'] = money_2
fig, ax = plt.subplots( figsize = ( 12, 7 ) )
UX1.Money_STR2a.plot( ax = ax )
ax.set_title( 'Money gained with the second trading strategy - Constant investment' )
ax.set_xlabel( '' )
ax.set_ylabel( 'Money gained' )
ax.grid( linestyle = 'dotted' )
ax.annotate( "COVID crisis", xy = ( '2020-03-15', 1050 ), xytext = ( '2020-06-01', 1100 ),
arrowprops = dict( arrowstyle = "->" ) )
plt.show()
money_2 = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money_2[0] = 1000
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['PX_LAST'].iloc[x] > 20:
        d1 = 1.5
        d2 = 15
    else:
        d1 = 15
        d2 = 1.5
    if UX1['open/close_2'].iloc[x] == 'open':
        money_2[x] = money_2[x-1] + money_2[x-1] / d1 * np.tanh( UX1['diff_sim'].iloc[x] / d1 ) * ( ( UX1['PX_LAST'].iloc[x] - UX1['PX_LAST'].iloc[x+1] ) / UX1['PX_LAST'].iloc[x] )
    elif UX1['open/close_2'].iloc[x] == 'close':
        money_2[x] = money_2[x-1] + money_2[x-1] / d2 * np.tanh( - UX1['diff_sim'].iloc[x] / d2 ) * ( ( UX1['PX_LAST'].iloc[x+1] - UX1['PX_LAST'].iloc[x] ) / UX1['PX_LAST'].iloc[x] )
    else:
        money_2[x] = money_2[x-1]
money_real_2 = np.zeros( shape = ( len( UX1['PX_LAST'] ) ) )
money_real_2[0] = 1000
for x in range( 1, len( UX1['PX_LAST'] ) ):
    if UX1['PX_LAST'].iloc[x] > 20:
        d1 = 1.5
        d2 = 15
    else:
        d1 = 15
        d2 = 1.5
    if UX1['open/close_real_2'].iloc[x] == 'open':
        money_real_2[x] = money_real_2[x-1] + money_real_2[x-1] / d1 * np.tanh( UX1['diff_sim_real'].iloc[x] / d1 ) * ( ( UX1['PX_LAST'].iloc[x] - UX1['PX_LAST'].iloc[x+1] ) / UX1['PX_LAST'].iloc[x] )
    elif UX1['open/close_real_2'].iloc[x] == 'close':
        money_real_2[x] = money_real_2[x-1] + money_real_2[x-1] / d2 * np.tanh( - UX1['diff_sim_real'].iloc[x] / d2 ) * ( ( UX1['PX_LAST'].iloc[x+1] - UX1['PX_LAST'].iloc[x] ) / UX1['PX_LAST'].iloc[x] )
    else:
        money_real_2[x] = money_real_2[x-1]
UX1['Money_STR2b'] = money_2
fig, ax = plt.subplots( figsize = ( 12, 7 ) )
UX1.Money_STR2b.plot( ax = ax, label = 'Money' )
ax.set_title( 'Money gained with the second trading strategy - Variable investment' )
ax.set_xlabel( '' )
ax.set_ylabel( 'Money gained' )
ax.grid( linestyle = 'dotted' )
ax.legend( loc = 'upper left' )
ax.annotate('European Debt Crisis', xy = ( '2011-02-01', 1200 ), xytext = ( '2011-02-01', 1500 ),
ha='center', va='center',
arrowprops=dict(arrowstyle='-[, widthB=4, lengthB=2' ) )
ax.annotate( "COVID crisis", xy = ( '2020-02-20', 4000 ), xytext = ( '2020-08-01', 4800 ),
arrowprops = dict( arrowstyle = "->" ) )
plt.show()
fig, ax1 = plt.subplots(figsize=(12, 10))
color = 'tab:red'
ax1.set_xlabel('')
ax1.set_ylabel('Money gained', color=color)
UX1.Money_STR2b.plot( ax = ax1, label = 'Money', c = color, lw = 3 )
ax1.tick_params(axis='y', labelcolor=color)
ax1.grid(linestyle='dotted',color=color)
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%b %y"))
ax1.xaxis.set_minor_formatter(mdates.DateFormatter("%b %y"))
ax1.set_title('Money gained and VIX value', fontsize=14)
#plt.rcParams['font.size'] = '8'
ax1.legend(loc='upper left')
ax2 = ax1.twinx()
color = 'tab:blue'
ax2.set_ylabel('VIX value', color=color)
UX1.PX_LAST.plot( ax = ax2, label = 'VIX value', color = color )
ax2.tick_params(axis='y', labelcolor=color)
#ax2.xaxis.set_major_locator(mdates.MonthLocator(interval=1))
ax2.legend(loc='upper left', bbox_to_anchor=(0, 0.95))
ax2.grid(linestyle='dotted', color=color)
fig.autofmt_xdate()
fig.tight_layout() # otherwise the right y-label is slightly clipped
plt.show()
UX1['Money_Daily'] = UX1.Money_STR2b - UX1.Money_STR2b.shift( 1 )
# normalized delta between money[t] and money[t+1] (shift(-1) looks one day ahead)
# UX1
delta_11 = (UX1['Money_STR1b'] - UX1['Money_STR1b'].shift(-1)) / UX1['Money_STR1b']
delta_12 = (UX1['Money_STR2b'] - UX1['Money_STR2b'].shift(-1)) / UX1['Money_STR2b']
fig,ax = plt.subplots(figsize=(15,5), sharex=True)
ax.plot( delta_12, label = '% money wrt the day before second strategy' )
ax.plot( delta_11, label = '% money wrt the day before first strategy' )
ax.set_title( '% money gained wrt the day before with first and second trading strategy - Variable investment-UX1' )
ax.set_xlabel( 'Years' )
ax.set_ylabel( '%' )
ax.grid( linestyle = 'dotted' )
ax.legend( loc = 'upper left' )
fig,ax = plt.subplots(figsize=(15,5), sharex=True)
ax.plot( UX1['Money_STR1b'].rolling(20).std(),
         label = 'Rolling 20-day std, first strategy' )
ax.plot( UX1['Money_STR2b'].rolling(20).std(),
         label = 'Rolling 20-day std, second strategy' )
ax.set_title( 'Rolling 20-day std with first and second trading strategy - Variable investment-UX1' )
ax.set_xlabel( 'Years' )
ax.set_ylabel( 'Std of money' )
ax.grid( linestyle = 'dotted' )
ax.legend( loc = 'upper left' )
Now we repeat the previous analysis for the second futures data (UX2) using the second strategy:
UX2['diff_sim'] = UX2['PX_LAST'] - UX2['mean_sim_m'].shift(-1)
conditions = [ ( UX2['diff_sim'] > 0 ),
( UX2['diff_sim'] <= 0 ) ]
choices = ['open','close']
UX2['open/close_2'] = np.select( conditions, choices, default = 0 )
UX2['diff_sim_real'] = UX2['PX_LAST'] - UX2['PX_LAST'].shift(-1)
conditions = [ ( UX2['diff_sim_real'] > 0 ),
( UX2['diff_sim_real'] <= 0 ) ]
choices = ['open','close']
UX2['open/close_real_2'] = np.select( conditions, choices, default = 0 )
money_2 = np.zeros( shape = ( len( UX2['PX_LAST'] ) ) )
money_2[0] = 1000
for x in range( 1, len( UX2['PX_LAST'] ) ):
if UX2['PX_LAST'].iloc[x] > 20:
d1 = 1.5
d2 = 15
else:
d1 = 15
d2 = 1.5
if UX2['open/close_2'].iloc[x] == 'open':
money_2[x] = money_2[x-1] + money_2[x-1] / d1 * np.tanh( UX2['diff_sim'].iloc[x] / d1 ) * ( ( UX2['PX_LAST'].iloc[x ] - UX2['PX_LAST'].iloc[x+1] ) / UX2['PX_LAST'].iloc[x] )
elif UX2['open/close_2'].iloc[x] == 'close':
money_2[x] = money_2[x-1] + money_2[x-1] / d2 * np.tanh( - UX2['diff_sim'].iloc[x] / d2 ) * ( ( UX2['PX_LAST'].iloc[x+1] - UX2['PX_LAST'].iloc[x ] ) / UX2['PX_LAST'].iloc[x] )
else:
money_2[x] = money_2[x-1]
money_real_2 = np.zeros( shape = ( len( UX2['PX_LAST'] ) ) )
money_real_2[0] = 1000
for x in range( 1, len( UX2['PX_LAST'] ) ):
if UX2['PX_LAST'].iloc[x] > 20:
d1 = 1.5
d2 = 15
else:
d1 = 15
d2 = 1.5
if UX2['open/close_real_2'].iloc[x] == 'open':
money_real_2[x] = money_real_2[x-1] + money_real_2[x-1] / d1 * np.tanh( UX2['diff_sim_real'].iloc[x] / d1 ) * ( ( UX2['PX_LAST'].iloc[x ] - UX2['PX_LAST'].iloc[x+1] ) / UX2['PX_LAST'].iloc[x] )
elif UX2['open/close_real_2'].iloc[x] == 'close':
money_real_2[x] = money_real_2[x-1] + money_real_2[x-1] / d2 * np.tanh( - UX2['diff_sim_real'].iloc[x] / d2 ) * ( ( UX2['PX_LAST'].iloc[x+1] - UX2['PX_LAST'].iloc[x ] ) / UX2['PX_LAST'].iloc[x] )
else:
money_real_2[x] = money_real_2[x-1]
UX2['Money_STR2b'] = money_2
fig, ax = plt.subplots( figsize = ( 12, 7 ) )
UX2.Money_STR2b.plot( ax = ax )
ax.set_title( 'Money gained with the second trading strategy - Variable investment' )
ax.set_xlabel( '' )
ax.set_ylabel( 'Money gained' )
ax.grid( linestyle = 'dotted' )
ax.annotate('European Debt Crisis', xy = ( '2011-02-01', 1200 ), xytext = ( '2011-02-01', 1500 ),
ha='center', va='center',
arrowprops=dict(arrowstyle='-[, widthB=4, lengthB=2' ) )
ax.annotate( "COVID crisis", xy = ( '2020-02-20', 2800 ), xytext = ( '2020-10-01', 2600 ),
arrowprops = dict( arrowstyle = "->" ) )
plt.show()
# normalized delta between money[t] and money[t+1] (shift(-1) looks one day ahead)
# UX2
delta_22 = (UX2['Money_STR2b'] - UX2['Money_STR2b'].shift(-1)) / UX2['Money_STR2b']
fig,ax = plt.subplots(figsize=(15,5), sharex=True)
ax.plot( delta_22, label = '% money wrt the day before second strategy' )
#ax[1].plot( delta_11, label = '% money wrt the day before first strategy' )
ax.set_title( '% money gained wrt the day before with second trading strategy - Variable investment-UX2' )
ax.set_xlabel( 'Years' )
ax.set_ylabel( '%' )
ax.grid( linestyle = 'dotted' )
ax.legend( loc = 'upper left' )
fig,ax = plt.subplots(figsize=(15,5), sharex=True)
ax.plot( UX2['Money_STR2b'].rolling(20).std(),
label = 'std on each month second strategy' )
#ax[1].plot( delta_11, label = '% money wrt the day before first strategy' )
ax.set_title( 'Std on each month with second trading strategy - Variable investment-UX2' )
ax.set_xlabel( 'Years' )
ax.set_ylabel( 'Std of money' )
ax.grid( linestyle = 'dotted' )
ax.legend( loc = 'upper left' )
In this section, we apply machine learning methods to predict tomorrow's VIX given today's VIX. First, we start with Heston ERM: we look for the optimal parameters, i.e. those that lead to the lowest value of our 5-dimensional loss function. Additionally, we use neural networks as a purely data-driven (non-physical) experiment to predict tomorrow's volatility.
So far, we have used the Heston model to predict tomorrow's VIX given today's volatility. The Heston model contains several parameters which, up to now, have been estimated with deterministic maximum-likelihood (MLE) methods. In this section, we keep the Heston model but try an alternative method for estimating its parameters, drawn from machine learning. We split the data into a training set and a test set, and build a squared loss function using Heston as the hypothesis and the real values of tomorrow's volatility as labels. Minimization of the loss function is carried out with the SciPy library and also with a manually implemented Adam optimizer.
"""Loading the Data"""
data_dir = 'data/'  # avoid shadowing the built-in `dir`
file_name = 'grid1_zheb51fo.xlsx'
UX1 = pd.read_excel(data_dir + file_name, sheet_name='UX1_Index')
UX2 = pd.read_excel(data_dir + file_name, sheet_name='UX2_Index')
UX1 = UX1.set_index('Date')
UX2 = UX2.set_index('Date')
UX1.dropna(subset = ["PX_LAST"], inplace=True) #Getting rid of NaN values
UX2.dropna(subset = ["PX_LAST"], inplace=True)
UX1.sort_index(inplace=True)
UX2.sort_index(inplace=True)
dataset = np.array(UX1.PX_LAST)
# M = 100 #M times magnification
# dataset= dataset * M
"""Specifying the Input & Output (Labels)"""
n=0 #Looking at n previous days, 0 for Heston predictor
X= [[dataset[j] for j in range(i, i+n+1)] for i in range(len(dataset) - n - 1)]
Y= [dataset[i+n+1] for i in range(len(dataset) - n -1)]
"""Splitting Data into Train and Test set"""
m_training= 2000
m_test= 1000
X_training_heston= X[:m_training]
Y_training_heston= Y[:m_training]
X_test_heston= X[m_training:m_training+m_test]
Y_test_heston= Y[m_training:m_training+m_test]
if n==0:
X_training_heston = np.ravel(X_training_heston)
X_test_heston = np.ravel(X_test_heston)
Y_training_heston = np.array(Y_training_heston)
Y_test_heston = np.array(Y_test_heston)
We consider a squared loss function as:
$$ l_{Heston} = \left( h_{Heston}(x_i) - y_i \right)^2 $$

$$ L_{Heston} = \frac{1}{m} \sum_{i=1}^{m} \left( h_{Heston}(x_i) - y_i \right)^2 $$

where:
$ {h}_{Heston}$ : volatility of tomorrow calculated via Heston pde
$ {x}_{i}$: volatility of today
$ {y}_{i}$: volatility of tomorrow
$ {m} $ : number of training samples
Our task is to compute:

$$ \operatorname{argmin}_{W} \; L_{Heston} $$

where:

$W$: the parameters of Heston's PDE
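Before building the hypothesis in code, the empirical loss above can also be sketched in vectorized form. This is a minimal illustration on synthetic data: `heston_step`, `empirical_loss`, and all numeric values are placeholder assumptions, not the project's actual pipeline.

```python
import numpy as np

def heston_step(v_today, k, theta, rho, sigma, rng):
    """One Milstein step of the Heston variance process (dt = 1), vectorized."""
    # correlated Brownian pair; as in the notebook, only the second component is used
    wt = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]],
                                 size=len(v_today))[:, 1]
    return np.abs(v_today + k * (theta - v_today)
                  + sigma * np.sqrt(v_today) * wt
                  + 0.25 * sigma**2 * (wt**2 - 1))

def empirical_loss(params, x, y, rng):
    """L = (1/m) * sum_i (h_Heston(x_i) - y_i)^2, over the whole set at once."""
    k, theta, rho, sigma = params
    return np.mean((heston_step(x, k, theta, rho, sigma, rng) - y) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(10.0, 30.0, size=100)    # placeholder "today" volatilities
y = x + rng.normal(0.0, 1.0, size=100)   # placeholder "tomorrow" labels
loss = empirical_loss((0.1, 1.5, 0.0, 1.0), x, y, rng)
```

This computes the same quantity as the loop-based `Ls` below, only over the whole sample at once.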
"""Building the Hypothesis"""
def heston_pde_milstein(V0, k, theta, rho, sigma):
WT = np.sqrt( 1 ) * np.random.multivariate_normal(np.array([0, 0]), np.array([[1, rho], [rho, 1]]))[1]
V1 = np.abs(V0+ k * (theta - V0) * 1 + sigma * np.sqrt(V0) * WT + .25 * sigma**2 * (WT**2 - 1))
return V1
"""Building the Loss Function"""
#The difference between real label and the predicted one to the power of 2
#l = (heston_pde_milstein(X_training_heston[i], r, k, theta, rho, sigma) - Y_training_heston[i])**2
m=len(X_training_heston) #Training set size
# k: x[0], theta:x[1], rho:x[2], sigma:x[3]
def Ls(X):
    def heston_inner_func(i): #calculates the predicted label for each training sample
WT = np.random.multivariate_normal(np.array([0, 0]), np.array([[1, X[2]], [X[2], 1]]))[1]
V1 = np.abs(X_training_heston[i] + X[0] * (X[1] - X_training_heston[i]) * 1 +
X[3] * np.sqrt(X_training_heston[i]) * WT + .25 * X[3]**2 * (WT**2 - 1))
return V1
Ls = (1/m) * np.sum(np.array([(heston_inner_func(i) - Y_training_heston[i])**2 for i in range(m)]))
return Ls
"""ERM: Empirical Risk Minimization, Using Scipy"""
result = minimize(Ls, (0.1, 1.5, 3, 1))
print("Best Parameters:", result.x)
print("Minimum Ls:", result.fun)
Best Parameters: [0.10004866 1.50002489 3.00000222 0.99999779] Minimum Ls: 56.969678315954226
The Adam optimizer is an extension of stochastic gradient descent based on adaptive estimates of the first- and second-order moments of the gradient; it combines the benefits of the AdaGrad and RMSProp algorithms.
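As a minimal sketch of the update rule on a toy quadratic (the function `adam_minimize` and all constants here are illustrative assumptions), Adam keeps exponentially decayed first and second moments of the gradient and divides the bias-corrected momentum by the root of the bias-corrected squared-gradient average:

```python
import numpy as np

def adam_minimize(grad, x0, eta=0.1, gamma=0.9, beta=0.99, eps=1e-8, n_steps=500):
    """Minimal Adam sketch with step-dependent bias correction."""
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)   # first moment (momentum)
    s = np.zeros_like(x)   # second moment (mean of squared gradients)
    for t in range(1, n_steps + 1):
        g = np.asarray(grad(x))
        v = gamma * v + (1 - gamma) * g
        s = beta * s + (1 - beta) * g * g
        v_hat = v / (1 - gamma ** t)   # bias correction shrinks as t grows
        s_hat = s / (1 - beta ** t)
        x = x - eta * v_hat / (np.sqrt(s_hat) + eps)
    return x

# minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3)
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=[0.0])
```

Note that the full bias correction divides by $1-\gamma^t$ at step $t$; a fixed factor only matches this at the first step.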
"""ERM: Empirical Risk Minimization, Using Adam optimizer"""
def grad(params):
m=len(X_training_heston) #Training set size
def Ls(X):
        def heston_inner_func(i): #predicted label per sample; note the ordering here: rho = X[0], k = X[1], theta = X[2], sigma = X[3]
WT = np.random.multivariate_normal(np.array([0, 0]), np.array([[1, X[0]], [X[0], 1]]))[1]
V1 = np.abs(X_training_heston[i] + X[1] * (X[2] - X_training_heston[i]) * 1 +
X[3] * np.sqrt(X_training_heston[i]) * WT + .25 * X[3]**2 * (WT**2 - 1))
return V1
Ls = (1/m) * np.sum(np.array([(heston_inner_func(i) - Y_training_heston[i])**2 for i in range(m)]))
return Ls
G = nd.Gradient(Ls)([params[0], params[1], params[2], params[3]])
return G
def adams(grad, init, n_epochs=500, eta=10**-4, gamma=0.9, beta=0.99,epsilon=10**-8,noise_strength=0):
params=np.array(init)
param_traj=np.zeros([n_epochs+1,4])
param_traj[0,]=init
v=0;
grad_sq=0;
for j in range(n_epochs):
noise=noise_strength*np.random.randn(params.size)
g=np.array(grad(params))+noise
v=gamma*v+(1-gamma)*g
grad_sq=beta*grad_sq+(1-beta)*g*g
        v_hat=v/(1-gamma)            # note: full Adam uses 1-gamma**(j+1) here (step-dependent bias correction)
        grad_sq_hat=grad_sq/(1-beta) # likewise 1-beta**(j+1)
params=params-eta*np.divide(v_hat,np.sqrt(grad_sq_hat+epsilon))
param_traj[j+1,]=params
return param_traj
Nsteps=10
lr_l=0.001
init1 = np.array([-0.7,0.5,0,1.5])
adam_traj=adams(grad,init1,Nsteps,eta=lr_l, gamma=0.9, beta=0.99,epsilon=10**-8,noise_strength=0)
minimum = adam_traj[-1]
print(minimum)
[-0.6979262 0.48233689 0.003327 1.48600835]
"""Specifying the Input & Output (Labels)"""
n=0 #Looking at n previous days, 0 for Heston predictor
X= [[dataset[j] for j in range(i, i+n+1)] for i in range(len(dataset) - n - 1)]
Y= [dataset[i+n+1] for i in range(len(dataset) - n -1)]
"""Splitting Data into Train and Test set"""
m_training_red= 300 #reduced because of high computational time
m_test_red= 100
X_training_heston_red= X[:m_training_red]
Y_training_heston_red= Y[:m_training_red]
X_test_heston_red= X[m_training_red:m_training_red+m_test_red]
Y_test_heston_red= Y[m_training_red:m_training_red+m_test_red]
if n==0:
X_training_heston_red = np.ravel(X_training_heston_red)
X_test_heston_red = np.ravel(X_test_heston_red)
Y_training_heston_red = np.array(Y_training_heston_red)
Y_test_heston_red = np.array(Y_test_heston_red)
m_red=len(X_training_heston_red) #Training set size
"""Qualitative Analysis of the Loss Function"""
#In this section we try to plot the loss function (with reduced dimensions) to see what's going on
def Ls(X, Y, kappa, rho):
    def heston_inner_func(i): #calculates the predicted label for each training sample
WT = np.random.multivariate_normal(np.array([0, 0]), np.array([[1, rho], [rho, 1]]))[1]
V1 = np.abs(X_training_heston_red[i] + kappa * (X - X_training_heston_red[i]) * 1 +
Y * np.sqrt(X_training_heston_red[i]) * WT + .25 * Y**2 * (WT**2 - 1))
return V1
Ls = (1/m_red) * np.sum(np.array([(heston_inner_func(i) - Y_training_heston_red[i])**2 for i in range(m_red)]))
return Ls
Z=[[] for i in range(12)]
for x in np.linspace(-4,4,8):
for y in np.linspace(-4,4,8):
Z[0].append(Ls(x,y,0, -0.7))
Z[1].append(Ls(x,y,0.2, -0.7 ))
Z[2].append(Ls(x,y,0.7, -0.7 ))
Z[3].append(Ls(x,y,0, -0.2))
Z[4].append(Ls(x,y,0.2, -0.2))
Z[5].append(Ls(x,y,0.7, -0.2))
Z[6].append(Ls(x,y,0, 0.2))
Z[7].append(Ls(x,y,0.2,0.2))
Z[8].append(Ls(x,y,0.7, 0.2))
Z[9].append(Ls(x,y ,0 ,0.7))
Z[10].append(Ls(x,y,0.2 ,0.7))
Z[11].append(Ls(x,y,0.7 ,0.7))
Since the Heston model contains 4 parameters, the corresponding loss function is 5-dimensional. To get a grasp of its shape, we let two parameters vary while keeping the other two constant, so that each slice can be plotted in 3 dimensions.
"""Plotting the Loss Function in 3D"""
x = np.linspace(-4,4,8)
y = np.linspace(-4,4,8)
X,Y = np.meshgrid(x,y)
X = np.ravel(X)
Y = np.ravel(Y)
# fig, ax1 = plt.subplots( subplot_kw={"projection": "3d"}, figsize=(10,10))
fig, axes = plt.subplots(2, 2, subplot_kw={"projection": "3d"}, figsize=(15, 15))
rho_values = [-0.7, -0.2, 0.2, 0.7]  # the rho values used when filling Z above
for idx, (ax, rho) in enumerate(zip(np.ravel(axes), rho_values)):
    for k_idx in range(3):  # one surface per kappa in {0, 0.2, 0.7}
        ax.plot_trisurf(X, Y, Z[3 * idx + k_idx])
    ax.set_title('rho = {}'.format(rho), color='black', fontsize='xx-large')
    ax.set_xlabel('Sigma', fontsize='x-large')
    ax.set_ylabel('Theta', fontsize='x-large')
    ax.set_zlabel('Ls', fontsize='x-large')
    ax.set_facecolor("gray")
fig.suptitle('Heston 5-d loss function sliced into separate 3-d plots', color='black', fontsize=20)
plt.show()
"""Plotting the Loss Function in 3D, Interactive Mode_1"""
# %matplotlib notebook
# from mpl_toolkits.mplot3d import axes3d
x = np.linspace(-4,4,8)
y = np.linspace(-4,4,8)
X,Y = np.meshgrid(x,y)
X = np.ravel(X)
Y = np.ravel(Y)
fig, ax0 = plt.subplots( subplot_kw={"projection": "3d"}, figsize=(10,10))
ax0.plot_trisurf(X,Y,Z[7])   # (kappa, rho) = (0.2, 0.2)
ax0.plot_trisurf(X,Y,Z[10])  # (kappa, rho) = (0.2, 0.7)
ax0.plot_trisurf(X,Y,Z[11])  # (kappa, rho) = (0.7, 0.7)
ax0.set_title('Surfaces for several (kappa, rho) pairs', color ='black', fontsize='xx-large')
ax0.set_xlabel('Sigma', fontsize='x-large')
ax0.set_ylabel('Theta', fontsize='x-large')
ax0.set_zlabel('Ls', fontsize='x-large')
fig.suptitle('Heston Loss Function as a function of Sigma and Theta',color='black', fontsize=20)
# ax.legend(loc="best")
ax0.set_facecolor("gray")
"""Plotting the Loss Function in 3D, Interactive Mode_2"""
def Ls(X, Y, theta, rho):
    def heston_inner_func(i): #calculates the predicted label for each training sample
WT = np.random.multivariate_normal(np.array([0, 0]), np.array([[1, rho], [rho, 1]]))[1]
V1 = np.abs(X_training_heston[i] + X * (theta - X_training_heston[i]) * 1 +
Y * np.sqrt(X_training_heston[i]) * WT + .25 * Y**2 * (WT**2 - 1))
return V1
Ls = (1/m) * np.sum(np.array([(heston_inner_func(i) - Y_training_heston[i])**2 for i in range(m)]))
return Ls
Z=[]
for x in np.linspace(-6,6,24):
for y in np.linspace(-6,6,24):
Z.append(Ls(x,y, 0, 0.5))
x = np.linspace(-6,6,24)
y = np.linspace(-6,6,24)
X,Y = np.meshgrid(x,y)
X = np.ravel(X)
Y = np.ravel(Y)
fig, ax0 = plt.subplots( subplot_kw={"projection": "3d"}, figsize=(10,10))
ax0.plot_trisurf(X,Y,Z, color='paleturquoise')
ax0.set_xlabel('sigma', fontsize='x-large')  # the fast-varying grid index corresponds to sigma
ax0.set_ylabel('kappa', fontsize='x-large')
ax0.set_zlabel('Ls')
ax0.set_title("Heston loss function as a function of kappa and sigma, rho=0.5, theta=0",
fontsize='x-large')
# ax.legend(loc="best")
ax0.set_facecolor("gray")
"""Best paramteres according to qualitative observatinon"""
# k, theta, rho, sigma = 0.1, 0, 0.5, 0.5
k, theta, rho, sigma = 0.1, 0, 0.5, 0.1
Using the parameters chosen from the qualitative observation above and plugging them into the Heston PDE, we calculate the true error on the test set.
"""Evaluating the model on test Set"""
# k, theta, rho, sigma = best_params[0], best_params[1], best_params[2], best_params[3]
Y_pred = np.array([heston_pde_milstein(X_test_heston[i], k, theta, rho, sigma) for i in range(len(X_test_heston))])
Heston_ERM_pred = np.ravel(Y_pred)
#True Error
Ld = (1/len(Y_test_heston)) * np.sum((Heston_ERM_pred - Y_test_heston)**2)
print("The True error is:", Ld)
The True error is: 7.183024220651337
The true error seems quite low, but is this result valid? In our problem we predict tomorrow's VIX from today's volatility, so the label set is simply the input set shifted one step to the left. This leads to a strong similarity between inputs and labels. Consequently, even when the model achieves high scores or a low error on the test set, we cannot tell whether this is due to the similarity between inputs and labels or because the model is truly predicting tomorrow's VIX. To find out, we plot not only the predictions and the true values of tomorrow's volatility, but also the input set, i.e. the values of today's volatility.
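The worry can be made concrete on synthetic data: for a strongly autocorrelated series, the naive persistence predictor ("tomorrow equals today") already achieves a small squared error, so a low test error by itself does not demonstrate genuine forecasting. A minimal sketch on an artificial AR(1) series (all constants are illustrative, not fitted to the VIX):

```python
import numpy as np

rng = np.random.default_rng(1)

# synthetic AR(1) series mimicking a slowly-moving volatility index
n = 1000
series = np.empty(n)
series[0] = 20.0
for t in range(1, n):
    series[t] = 20.0 + 0.98 * (series[t - 1] - 20.0) + rng.normal(0.0, 0.5)

x_today, y_tomorrow = series[:-1], series[1:]

# "persistence" baseline: predict tomorrow = today, no model at all
persistence_mse = np.mean((x_today - y_tomorrow) ** 2)
series_var = np.var(y_tomorrow)
```

Here `persistence_mse` is far below the variance of the series itself, even though the "predictor" has no forecasting ability whatsoever; any model whose error is compared only against zero, rather than against this baseline, can look deceptively good.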
"""Investigating the validity of result"""
plt.style.use('dark_background')
plt.figure(figsize=(14,7))
plt.plot([t for t in range(len(Y_pred))], Y_test_heston, label = "Tomorrow Volatility")
plt.plot([t for t in range(len(Y_pred))], X_test_heston, label = "Today Volatility")
plt.plot([t for t in range(len(Y_pred))], Y_pred, label = "Prediction of Tomorrow")
plt.title("Prediction of Tomorrow Volatility Using Qualitative Method for Parameter Estimation",
fontsize='x-large')
# plt.axline((20,X_test_heston.mean()), (20, X_test_heston.max()), color="red",
# label = 'Cutoff frequency', linestyle=(0, (5, 5)))
# plt.axline((40, X_test_heston.mean()), (40, X_test_heston.max()),
# color="red", label = 'Cutoff frequency', linestyle=(0, (5, 5)))
plt.xlabel("Time", fontsize='x-large')
plt.ylabel("Volatility", fontsize='x-large')
plt.legend(loc= 'best', fontsize = 'xx-large')
a,b= 600,620
axes = plt.axes( [ 0.2, 0.5, 0.2, 0.3 ] )
axes.plot( np.arange( a , b ), Y_test_heston[a:b], label = "Y_test")
axes.plot( np.arange( a , b ), X_test_heston[a:b], label = "X_test")
axes.plot( np.arange( a , b ), Y_pred[a:b], label = "Y_pred")
# axes.axline((40, X_test_heston.mean()), (40, X_test_heston.max()),
# color="red", label = 'Cutoff frequency', linestyle=(0, (5, 5)))
# axes.legend()
axes.grid( linestyle = 'dotted' )
"""What's the probelm?"""
plt.style.use('dark_background')
plt.figure(figsize=(14,7))
plt.plot([t for t in range(len(Y_pred))], Y_test_heston, label = "Tomorrow Volatility")
plt.plot([t for t in range(len(Y_pred))], X_test_heston, label = "Today Volatility")
plt.plot([t for t in range(len(Y_pred))], Y_pred, label = "Prediction of Tomorrow")
plt.axline((612,X_test_heston.mean()), (612, X_test_heston.max()), color="red",
linestyle=(0, (5, 5)))
plt.title("Zooming into Prediction of Tomorrow Volatility Using Qualitative Method for Parameter Estimation"
,fontsize='x-large')
plt.xlabel("Time", fontsize='x-large')
plt.ylabel("Volatility", fontsize='x-large')
plt.xlim(600,620)
plt.ylim(20,40)
plt.legend(loc='best',fontsize = 'xx-large')
Now we can confirm that our doubt about the validity of the result was unfortunately true. The reason we get a low true error and high scores is not that the model properly predicts the future, but that the predictions are merely a translation of today's volatility; since today's and tomorrow's volatilities are quite close to each other, the difference between the predicted and the real values of tomorrow's volatility becomes small, which in turn leads to a low true error.
We realized that the low true error obtained with the model is somewhat misleading, due to the closeness of today's and tomorrow's VIX. Let us do some quantification to illustrate this problem better.
Distance_from_tomorrow = np.abs(Y_pred - Y_test_heston)
Distance_from_today = np.abs(Y_pred - X_test_heston)
print("Distance to Today:\n" , Distance_from_today.mean())
print("\nDistance to Tommorow:\n", Distance_from_tomorrow.mean())
plt.figure(figsize=(14,7))
plt.plot([t for t in range(len(Y_pred))], Distance_from_tomorrow, color= 'red', label='Dist. to Tomorrow')
plt.plot([t for t in range(len(Y_pred))], Distance_from_today, color= 'black', label='Dist. to Today')
plt.axline((580,Distance_from_tomorrow.mean()), (620, Distance_from_tomorrow.mean()), color="red",
linestyle=(0, (5, 5)), label='Average of Dist. to Tomorrow')
plt.axline((580,Distance_from_today.mean()), (620,Distance_from_today.mean()), color="black",
linestyle=(0, (5, 5)), label='Average of Dist. to Today')
plt.title("Distances between predicted volatlity and tomorrow/today volatilty", fontsize='xx-large')
plt.legend(loc='best', fontsize='large')
plt.xlabel('Time', fontsize= 'x-large')
plt.ylabel('Distance', fontsize='x-large')
plt.xlim(580,620)
plt.ylim(-3,10)
Distance to Today: 2.1036446438452026 Distance to Tomorrow: 2.151392201614155
In this section, we use neural networks as an alternative approach to predicting the VIX.
Neural networks are inspired by the human brain: they consist of nodes, which imitate neural cells, and edges, which correspond to the synaptic connections between pairs of neurons.
Image('figures/NNN.jpg')
These neurons are connected in a rather sequential fashion, through weights; each weight represents the strength of the connection between two neurons. The connected neurons are organized into layers: the input layer, the hidden layers, and the output layer. The input layer's responsibility is to collect and receive the data, much like the output layer, whose job is simply to output the labels. The hidden layers do more: they process the data, form an idea of the relations between inputs and outputs, and imitate these relations to predict labels for new datasets. This collaboration between neurons and layers constitutes our neural network.
A neural network's activity boils down to the activity of its neurons, so we scrutinize the operations of a single neuron to better understand neural networks in general. Every neuron represents a linear predictor combined with a specific activation function; with many such neurons stacked together, the network as a whole processes the data in a non-linear fashion. Each neuron involves:
1. Linear function: a linear combination of the outputs of the previous layer and their weights.
2. Activation function: a non-linear function applied to the linear output, which makes it meaningful for the next layer.
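The two ingredients above can be sketched for a single neuron; the weights, bias, and input values below are illustrative placeholders:

```python
import numpy as np

def relu(z):
    # activation function: non-linear, keeps only the positive part
    return np.maximum(0.0, z)

def neuron(x, w, b):
    # linear function: weighted combination of the previous layer's outputs plus a bias
    z = np.dot(w, x) + b
    # the activation makes the output non-linear before it feeds the next layer
    return relu(z)

out = neuron(x=np.array([1.0, 2.0]), w=np.array([0.5, -0.25]), b=0.1)  # relu(0.5 - 0.5 + 0.1) = 0.1
```

A layer is just many such neurons applied to the same input, and the network is layers composed in sequence.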
#Let's, just for a moment, ignore any prior knowledge about the subject and see what happens
#Don't forget to run the first cell, just the first one
n=0 #Looking at n previous days to estimate parameters + today's volatility
X= [[dataset[j] for j in range(i, i+n+1)] for i in range(len(dataset) - n - 1)]
Y= [dataset[i+n+1] for i in range(len(dataset) - n -1)]
# Putting more emphasis on today's data:
emphasis = 1 #repeating today's value emphasis times
for i in range(len(X)):
for j in range(emphasis-1):
X[i].append(X[i][-1])
m_training= 2000
m_test= 1000
X_training_NN = X [:m_training]
Y_training_NN = Y [:m_training]
X_test_NN = X [m_training:m_training+m_test]
Y_test_NN = Y [m_training:m_training+m_test]
#Let's scale the Volatilities:
M = 1 #M times magnification
#We introduce M as a magnifying parameter to enlarge the difference between
#the VIX of today and tomorrow; nevertheless, changing M does not lead to a significant difference
X_training_NN = np.array(X_training_NN) * M
Y_training_NN = np.array(Y_training_NN) * M
X_test_NN = np.array(X_test_NN) * M
Y_test_NN = np.array(Y_test_NN) * M
Here we introduce the optimizer and the architecture of our model; on top of this, we also tuned some parameters.
NN_R = MLPRegressor(hidden_layer_sizes=(20, 20, 20), max_iter=1000,
alpha=1e-4, solver='adam', momentum=0.9,
activation='relu', tol=1e-4, learning_rate_init=0.01)
NN_R.fit(X_training_NN, Y_training_NN)
NN_pred = NN_R.predict(X_test_NN)
print("\nThe score:", NN_R.score(X_test_NN, Y_test_NN))
The score: 0.9599060087130856
As we can see, the score of the model on the test set is quite high! But, probably similar to the previous case, we are dealing with misleading numbers. Let's plot the related graphs to see what is going on.
plt.style.use('dark_background')
plt.figure(figsize=(14,7))
plt.plot([t for t in range(len(Y_test_NN))], Y_test_NN,color='gold', marker='.', label='Tomorrow Volatility')
plt.plot([t for t in range(len(Y_test_NN))], NN_pred,color='b', marker='.', label='Prediction of Tomorrow')
plt.plot([t for t in range(len(Y_test_NN))],
[X_test_NN[i][-1] for i in range(len(Y_test_NN))], color='r', marker='.',
label='Today Volatility')
plt.title("Prediction of Tomorrow Volatility Using Neural Network", fontsize='xx-large')
plt.xlabel("Time", fontsize='x-large')
plt.ylabel("Volatility", fontsize='x-large')
plt.xlim(400,450)
plt.ylim(12,22)
plt.legend(loc='best', fontsize = 'xx-large')
Distance_from_tomorrow = np.sum(np.abs(NN_pred - Y_test_NN))
Distance_from_today = np.sum(np.abs(np.array(
NN_pred - [X_test_NN[i][-1] for i in range(len(list(Y_test_NN)))])))
print("Distance to Today:\n" , Distance_from_today)
print("\nDistance to Tommorow:\n", Distance_from_tomorrow)
Distance to Today: 90.29142378071421 Distance to Tommorow: 797.6795147373366
#Let's investigate the behaviour of the distance w.r.t. n and emphasis
#Just run the first cell before running this cell
D_today = [] #The list to save all values
D_tomorrow = []
for emphasis in [3]:
Distance_from_today_list = []
Distance_from_tomorrow_list = []
for n in range(0,50,5):
X= [[dataset[j] for j in range(i, i+n+1)] for i in range(len(dataset) - n - 1)]
Y= [dataset[i+n+1] for i in range(len(dataset) - n -1)]
for i in range(len(X)):
for j in range(emphasis-1):
X[i].append(X[i][-1])
m_training = 2000
m_test = 1000
X_training = X [:m_training]
Y_training = Y [:m_training]
X_test = X [m_training: m_training + m_test]
Y_test = Y [m_training: m_training + m_test]
NN_R.fit(X_training, Y_training)
Y_pred = NN_R.predict(X_test)
Distance_from_tomorrow = np.sum(np.abs(Y_pred - Y_test))
Distance_from_today = np.sum(np.abs(np.array(
Y_pred - [X_test[i][-1] for i in range(len(list(Y_test)))])))
Distance_from_today_list.append(Distance_from_today)
Distance_from_tomorrow_list.append(Distance_from_tomorrow)
D_today.append(Distance_from_today_list)
D_tomorrow.append(Distance_from_tomorrow_list)
#Normalization (compute the common denominator first, so D_today is not rescaled twice):
norm = max(np.array(D_today).max(), np.array(D_tomorrow).max())
D_today = np.array(D_today) / norm
D_tomorrow = np.array(D_tomorrow) / norm
# Plotting the Result of previous part
plt.figure(figsize=(14,7))
# plt.plot([t for t in range(L)], Distance_from_today_list, )
plt.plot(np.array([t for t in range(len(D_tomorrow[0]))]) *5, D_tomorrow[0],
marker="o", label = 'Volatility Distance to Tomorrow')
plt.plot(np.array([t for t in range(len(D_today[0]))]) *5, D_today[0],
marker="o", label = 'Volatlity Distance to Today')
plt.title("Relative Distance of Predicted Volatility to Today/Tomorrow Volatility", fontsize='xx-large')
plt.ylabel('Distance', fontsize='x-large')
plt.xlabel('n', fontsize='x-large')
plt.legend(loc='best', fontsize = 'x-large')
plt.show()
So far we have investigated three different methods to predict the VIX of tomorrow based on today's VIX. In this section, we plot the predictions of all three methods in one figure.
#Load the MLE parameter-estimation prediction
MLE_pred = pd.read_csv('UX1_Heston_Predictions.csv').mean_sim_m
time_slice = 900
plt.figure(figsize=(16,8))
plt.plot([t for t in range(time_slice)], MLE_pred[1941:1941+time_slice],label= "MLE prediction")
plt.plot([t for t in range(time_slice)], NN_pred[:time_slice], label="Neural network prediction")
plt.plot([t for t in range(time_slice)], Heston_ERM_pred[:time_slice],label="Heston_ERM prediction")
plt.plot([t for t in range(time_slice)],
Y_test_NN[:time_slice], color ='r', #y_test is same for NN or heston or...
label= "Tomorrow Volatility")
plt.title("Comparison between Three methods used for Volatility prediction", fontsize='xx-large')
plt.ylabel("Volatility", fontsize='x-large')
plt.xlabel('Time', fontsize='x-large')
plt.legend(loc='best', fontsize = 'xx-large')
a,b=200, 220
axes = plt.axes( [ 0.15, 0.55, 0.31, 0.3 ] )
axes.plot( np.arange( a , b ), MLE_pred[1941+a : 1941+b] ,label = "MLE")
axes.plot( np.arange( a , b ), NN_pred[a : b], label = "NN")
axes.plot( np.arange( a , b ), Heston_ERM_pred[a:b], label = "Heston")
axes.plot( np.arange( a , b ), Y_test_NN[a:b], color = 'red', label = "True Vol.")
# axes.axline((40, X_test.mean()), (40, X_test.max()),
# color="red", label = 'Cutoff frequency', linestyle=(0, (5, 5)))
axes.legend(loc='best')
axes.grid( linestyle = 'dotted' )
start=400
time_slice = 50
plt.figure(figsize=(16,8))
plt.plot([t for t in range(time_slice)], MLE_pred[1941+start:1941+start+time_slice],label= "MLE prediction")
plt.plot([t for t in range(time_slice)], NN_pred[start:start+time_slice], label="Neural network prediction")
plt.plot([t for t in range(time_slice)], Heston_ERM_pred[start:start+
time_slice],label="Heston_ERM prediction")
plt.plot([t for t in range(time_slice)],
Y_test_NN[start:start+ time_slice], color ='r', #y_test is same for NN or heston or...
label= "Tomorrow Volatility")
plt.plot([t for t in range(time_slice)], X_test_NN[start:start+ time_slice], color= 'black',
ls= 'dashed', label= "Today Volatility")
plt.title("Zooming into Comparison between Three methods used for Volatility Prediction",
fontsize='xx-large')
plt.ylabel("Volatility", fontsize='x-large')
plt.xlabel('Time', fontsize='x-large')
plt.legend(loc='best', fontsize = 'xx-large')
Our main goal was to predict tomorrow's VIX given today's volatility. We tried three different methods: maximum likelihood, Heston ERM, and neural networks. We showed that the problem of translation inevitably occurs in all three cases; in other words, the predictions are more a mere translation of today's volatility than an authentic prediction of tomorrow's VIX. Concerning parameter estimation, we suggest that a stronger estimate of the parameters would probably lead to better results. When it comes to machine learning, things become more complicated: since the input (today's volatility) and the output (tomorrow's volatility) are very similar to each other, the squared loss function seems to lose its ability to function properly. If there is a setting in which ERM would work properly, it probably involves a loss function that simultaneously minimizes the distance between the predicted labels and the true labels, and maximizes the distance between the predicted labels and the input.
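The loss function suggested above can be written down as a sketch. The exact form of the second term and the trade-off weight `lam` are our own illustrative assumptions, not a tested design:

```python
import numpy as np

def contrastive_squared_loss(pred, y_true, x_input, lam=0.5):
    """Penalize distance to the true label while rewarding distance from the input.

    Sketch only: lam balances fitting the labels against the
    input-copying (translation) penalty.
    """
    fit_term = np.mean((pred - y_true) ** 2)        # stay close to tomorrow's values
    anti_copy_term = np.mean((pred - x_input) ** 2) # move away from today's values
    return fit_term - lam * anti_copy_term

# a predictor that merely copies the input scores worse than one near the label
x = np.array([20.0, 21.0, 22.0])   # today's volatility
y = np.array([21.0, 22.0, 23.0])   # tomorrow's volatility
copy_loss = contrastive_squared_loss(x, y, x)   # pred == input (pure translation)
good_loss = contrastive_squared_loss(y, y, x)   # pred == label
```

Under this loss, the pure-translation predictor is no longer a near-optimum, which is exactly the failure mode observed with the plain squared loss; how to choose `lam` and keep the objective bounded would need separate study.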